MURINE GUANINE NUCLEOTIDE EXCHANGE FACTOR (MNGEF)
AND HUMAN HOMOLOGUES THEREOF 

Field of the Invention

The present invention relates to MNGEF, a member of the family of regulators of
small GTP-binding proteins.

Background to the invention

The superfamily of low molecular mass GTP-binding proteins (also known as 
G proteins), for which ras proteins are prototypes, has been implicated in the regulation of
diverse biological activities. In addition to their involvement in regulating many aspects of
growth and differentiation, members of this superfamily play an important role in the 
control of the cytoskeleton and in the regulation of protein trafficking between various 
membrane-bound compartments in the cell. 

These proteins function as binary switches, being 'on' in the GTP-bound state and
'off' in the GDP-bound state. Cycling between these two forms is controlled by various
accessory proteins. The guanine nucleotide exchange factors (GEFs) promote the exchange
of GDP for GTP, thus activating the proteins, whereas the GTPase-activating proteins
(GAPs) and GDP-dissociation inhibitory factors (GDIs) are negative modulators. The
Ras-like proteins are divided into six main families, based on their sequences: Rab, Arf, Sar,
Ran, Rho and Ras.

Until recently, the Rho GTPases (such as Rac, Rho, Cdc42) were thought to be
primarily involved in the organisation of the actin cytoskeleton. However, it has become 
evident that they play a critical role in controlling cell proliferation and progress has been 
made in identifying signalling cascades involving the Rho family members. 

A family of cell growth regulatory proteins and oncogene products has been
discovered for which the Dbl oncoprotein is a prototype (Eva and Aaronson (1985) Nature
316, 273-275). These proteins are putative guanine nucleotide exchange factors for the Rho
GTPases. They all contain a Dbl homology domain (DH) in tandem with a pleckstrin
homology domain (PH), and seem to activate specific members of the Rho family to elicit a
variety of biological functions in the cell. The DH domain is responsible for binding and
activating the G proteins, thus mediating downstream signalling events, whereas the PH
domain is thought to play a role in targeting these guanine nucleotide exchange factors to
specific cellular locations in order to carry out the signalling function. 

Since the initial identification of Dbl as a GEF for Rho GTPases, an increasing 
number of oncogene products and growth regulatory molecules have been shown to contain 
those two domains in tandem. Many of them, such as Bcr, which is involved in the
chromosomal rearrangements in chronic myelogenous leukaemia, Cdc24, Ras guanine
nucleotide release factor and Vav, have been implicated in cell growth regulation. Others,
including Ect-2, Tim, Ost and Lbc were discovered, by virtue of their transforming 
capability, through gene transfer methods. 

Disclosure of the invention 

Here we report the isolation and preliminary characterisation of 3 overlapping
mouse cDNAs (designated MNGEF1, MNGEF2 and MNGEF3), which show homology to
the TIM gene (Transforming Immortalized Mammary, Chan et al., (1994) Oncogene 9,
1057-1063) of the family of regulators of small GTP-binding proteins. The homology is
observed at both the amino acid and nucleotide levels. However, the size of the transcript
observed by Northern analysis and the expression pattern of MNGEF2 are markedly different
from those of TIM, suggesting that this is a novel, neuronal-specific member of the above family
of genes. In addition, MNGEF1 and MNGEF2 contain a trinucleotide repeat. Given
the high expression of MNGEF2 in brain, the presence of the triplet repeat and
the homology to TIM, these cDNAs are potential candidates for disease-related genes.

We also report the cloning and sequencing of a fragment of the human homologue
of MNGEF. Substantial homology is observed at both the amino acid and nucleotide levels 
between murine MNGEF and its human homologue NGEF. 

The MNGEF3 clone is 1.35 kb and is contained completely within the MNGEF1
cDNA, which is 2.3 kb. MNGEF2 is the longest clone (2.8 kb) but contains a 92 bp unspliced
intron (from nucleotides 1816 to 1907 of SEQ. ID No. 3), resulting in a premature
termination codon. MNGEF1 does not contain this intron and therefore its ORF extends
beyond the stop codon of MNGEF2. From the sequences of MNGEF1 and MNGEF2 we
conclude that the cDNA designated MNGEF consists of 2741 bp (2833 bp minus 92 bp),
which results in an ORF of 554 amino acids.

The murine MNGEF cDNA sequence is set out as SEQ. ID No. 1. The amino acid
sequence of the ORF from nucleotides 343 to 2004 is set out as SEQ. ID No. 2. The murine
MNGEF2 cDNA sequence, which includes the 92 bp intron, is set out as SEQ. ID No. 3. The
amino acid sequence of the ORF from nucleotides 343 to 1860 is set out as SEQ. ID No. 4.
The murine MNGEF1 cDNA sequence is set out as SEQ. ID No. 5. The amino acid
sequence of the ORF from nucleotides 2 to 1609 is set out as SEQ. ID No. 6. The partial
human NGEF cDNA sequence is set out as SEQ. ID No. 7. The amino acid sequence of the
ORF from nucleotides 3 to 803 is set out as SEQ ID No. 8.
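
The lengths and ORF coordinates quoted above can be cross-checked with simple arithmetic. The following Python sketch is illustrative only: it uses the coordinates stated in the text, assumes they are 1-based and inclusive, and the helper name orf_codons is hypothetical. Whether a stop codon is counted within each range is not stated in the text, so the sketch reports codon counts only.

```python
# Coordinates as quoted in the text (assumed 1-based, inclusive).
ORFS = {
    "SEQ ID No. 1 (MNGEF)": (343, 2004),
    "SEQ ID No. 3 (MNGEF2)": (343, 1860),
    "SEQ ID No. 5 (MNGEF1)": (2, 1609),
    "SEQ ID No. 7 (partial human NGEF)": (3, 803),
}

MNGEF2_LENGTH = 2833          # bp, as stated in the text
INTRON = (1816, 1907)         # unspliced intron in SEQ ID No. 3


def orf_codons(first_nt, last_nt):
    """Number of complete codons spanned by an inclusive nucleotide range."""
    length = last_nt - first_nt + 1
    if length % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    return length // 3


intron_length = INTRON[1] - INTRON[0] + 1
spliced_length = MNGEF2_LENGTH - intron_length

for name, (first, last) in ORFS.items():
    print(name, orf_codons(first, last), "codons")
print("intron:", intron_length, "bp; spliced cDNA:", spliced_length, "bp")
```

Under these assumptions the range 343 to 2004 spans 554 codons, which agrees with the ORF size given for MNGEF, and removing the 92 bp intron from 2833 bp gives 2741 bp.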

Thus the invention provides a murine guanine nucleotide exchange factor designated 
MNGEF, a human homologue thereof designated human NGEF or other mammalian
homologue thereof, which guanine nucleotide exchange factor is encoded by a cDNA
sequence obtainable from a mammalian brain cDNA library, said DNA sequence being
selectively detectable with a murine DNA sequence as shown in SEQ ID Nos. 1, 3 or 5 or a
human DNA sequence as shown in SEQ ID No. 7. 

The protein preferably has one or more of the additional features: 

(1) it comprises a Dbl homology domain having substantial homology to amino 
acids 124 to 306 of SEQ ID No. 2; 

(2) it comprises a pleckstrin homology domain having substantial homology to
amino acids 333 to 445 of SEQ ID No. 2;

(3) it comprises an SH3 domain (Src homology 3 domain) having substantial
homology to amino acids 456 to 517 of SEQ ID No. 2;

(4) it is found predominantly in neuronal cell types; 

(5) it is encoded by an mRNA of approximately 2.7 kb; 

(6) it promotes the exchange of GDP for GTP by low molecular mass GTP- 
binding proteins; and 

(7) it comprises a polyglutamine region. 

The term "selectively detectable" means that the cDNA used as a probe is used 
under conditions where a target cDNA of the invention is found to hybridize to the probe at 
a level significantly above background. The background hybridization may occur because 
of other cDNAs present in the brain cDNA library. In this event background implies a level 
of signal generated by interaction between the probe and a non-specific cDNA member of 
the library which is less than 10 fold, preferably less than 100 fold as intense as the specific
interaction observed with the target cDNA. The intensity of interaction may be measured,
for example, by radiolabelling the probe, e.g. with 32P. Suitable conditions may be found by
reference to the Examples. 

Accordingly, in a first aspect, the invention provides the MNGEF protein of SEQ
ID Nos. 2, 4, 6 and homologues thereof, polypeptide fragments thereof, as well as antibodies
capable of binding the MNGEF protein or polypeptide fragments thereof. The invention
also provides the human NGEF protein of SEQ. ID. No. 8 and homologues thereof,
polypeptide fragments thereof, as well as antibodies capable of binding the human NGEF
protein or polypeptide fragments thereof. Human NGEF proteins, homologues and
fragments thereof, are also included in references below to polypeptides of the invention. 

In another aspect, the present invention provides a polynucleotide in substantially
isolated form capable of hybridising selectively to any one of SEQ ID Nos. 1, 3, 5 or 7 or to
the complement (i.e. opposite strand) thereof. Also provided
are polynucleotides encoding polypeptides of the invention. Such polynucleotides will be
referred to as polynucleotides of the invention. A polynucleotide of the invention includes
DNA of SEQ ID Nos. 1, 3, 5 and fragments thereof capable of selectively hybridising to the
gene encoding MNGEF. A polynucleotide of the invention also includes DNA of SEQ ID
No. 7 and fragments thereof capable of selectively hybridising to the gene encoding human
NGEF.

In a further aspect, the invention provides recombinant vectors carrying a
polynucleotide of the invention, including expression vectors, and methods of growing such
vectors in a suitable host cell, for example under conditions in which expression of a protein
or polypeptide encoded by a sequence of the invention occurs. 

In an additional aspect, the invention provides kits comprising polynucleotides, 
polypeptides or antibodies of the invention and methods of using such kits in diagnosing the 
presence or absence of MNGEF, human NGEF and their homologues, or variants thereof,
including deleterious MNGEF and human NGEF mutants.

Detailed description of the invention.

A. Polynucleotides.

In the following description, it should be understood that references to MNGEF refer
additionally to MNGEF1, MNGEF2, MNGEF3 and human NGEF. Polynucleotides of the
invention may comprise DNA or RNA. They may be single or double stranded. They may
also be polynucleotides which include within them synthetic or modified nucleotides. A
number of different types of modification to oligonucleotides are known in the art. These
include methylphosphonate and phosphorothioate backbones, and the addition of acridine or
polylysine chains at the 3' and/or 5' ends of the molecule. For the purposes of the present
invention, it is to be understood that the polynucleotides described herein may be modified
by any method available in the art. Such modifications may be carried out in order to
enhance the in vivo activity or life span of polynucleotides of the invention.

Polynucleotides of the invention capable of selectively hybridising to the DNA of
SEQ ID No. 1 will be generally at least 70%, preferably at least 80 or 90% and more
preferably at least 95% homologous to the corresponding DNA of SEQ ID No. 1 over a
region of at least 20, preferably at least 25 or 30, for instance at least 40, 60 or 100 or more
contiguous nucleotides. Preferred polynucleotides of the invention will comprise regions
homologous to the DH domain of MNGEF, from nucleotides 712 to 1260 of SEQ ID No. 1,
preferably at least 80 or 90% and more preferably at least 95% homologous to the DH
domain of MNGEF. Preferred polynucleotides of the invention will also comprise regions
homologous to the PH domain of MNGEF, from nucleotides 1339 to 1677 of SEQ ID No.
1, preferably at least 80 or 90% and more preferably at least 95% homologous to the PH
domain of MNGEF. Preferred polynucleotides of the invention will further comprise
regions homologous to the SH3 domain of MNGEF, from nucleotides 1708 to 1893 of SEQ
ID No. 1, preferably at least 80 or 90% and more preferably at least 95% homologous to the
SH3 domain of MNGEF.
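
The nucleotide ranges given for the three domains follow from the amino acid ranges once the start of the ORF is fixed. The Python sketch below illustrates the conversion, assuming 1-based inclusive coordinates and the ORF start at nucleotide 343 of SEQ ID No. 1 as stated earlier in the text. The function name is hypothetical.

```python
ORF_START = 343  # first nucleotide of the MNGEF ORF in SEQ ID No. 1 (from the text)

# Amino acid ranges of the domains in SEQ ID No. 2, as quoted in the text.
DOMAINS = {"DH": (124, 306), "PH": (333, 445), "SH3": (456, 517)}


def residues_to_nucleotides(first_res, last_res, orf_start=ORF_START):
    """Map an inclusive residue range to the inclusive nucleotide range encoding it."""
    first_nt = orf_start + 3 * (first_res - 1)   # first base of the first codon
    last_nt = orf_start + 3 * last_res - 1       # third base of the last codon
    return first_nt, last_nt


for name, (first, last) in DOMAINS.items():
    print(name, residues_to_nucleotides(first, last))
```

With these inputs the sketch reproduces the ranges 712 to 1260, 1339 to 1677 and 1708 to 1893 quoted for the DH, PH and SH3 domains.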

It is to be understood that skilled persons may, using routine techniques, make 
nucleotide substitutions that do not affect the polypeptide sequence encoded by the
polynucleotides of the invention to reflect the codon usage of any particular host organism
in which the polypeptides of the invention are to be expressed. 
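
A silent recoding of this kind can be sketched in Python as follows. This is a minimal illustration, not a description of any method used in the Examples: the standard genetic code is assumed, and the preference table PREFERRED and the short input sequence are hypothetical values chosen only to show the principle.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna, preferred):
    """Swap each codon for the host-preferred synonymous codon, where one is given."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        new_codon = preferred.get(amino_acid, codon)
        if CODON_TABLE[new_codon] != amino_acid:
            raise ValueError("preferred codon is not synonymous")
        out.append(new_codon)
    return "".join(out)


# Hypothetical host preferences and a hypothetical input sequence (M L R G stop).
PREFERRED = {"L": "CTG", "R": "CGT", "G": "GGC"}
original = "ATGTTAAGAGGGTAA"
recoded = recode(original, PREFERRED)
print(recoded, translate(recoded) == translate(original))
```

The check inside recode rejects any replacement that would alter the encoded amino acid, so the polypeptide sequence is unchanged by construction.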

Any combination of the above mentioned degrees of homology and minimum sizes
may be used to define polynucleotides of the invention, with the more stringent
combinations (i.e. higher homology over longer lengths) being preferred. Thus for example 
a polynucleotide, which is at least 80% homologous over 25, preferably 30 nucleotides 
forms one aspect of the invention, as does a polynucleotide which is at least 90% 
homologous over 40 nucleotides. 
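
A criterion of the form "at least X% homologous over N contiguous nucleotides" can be checked with a sliding window. The sketch below is illustrative and makes simplifying assumptions that are not in the text: homology is taken as percent identity, and the two sequences are assumed to be already aligned without gaps and of equal length. The sequences shown are hypothetical.

```python
def best_window_identity(a, b, window):
    """Highest percent identity over any `window` contiguous aligned positions."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not 0 < window <= len(a):
        raise ValueError("window must be between 1 and the sequence length")
    matches = [x == y for x, y in zip(a, b)]
    count = sum(matches[:window])
    best = count
    # Slide the window one position at a time, updating the match count.
    for i in range(window, len(matches)):
        count += matches[i] - matches[i - window]
        best = max(best, count)
    return 100.0 * best / window


# Hypothetical 40-nucleotide sequences differing at positions 0, 10, 20 and 30.
reference = "ACGT" * 10
candidate = list(reference)
for pos, base in ((0, "C"), (10, "T"), (20, "C"), (30, "T")):
    candidate[pos] = base
candidate = "".join(candidate)

print(best_window_identity(reference, candidate, 40))  # identity over all 40
print(best_window_identity(reference, candidate, 25))  # best 25-nucleotide window
```

In practice an alignment step would precede this calculation; the sketch shows only how a percentage threshold and a minimum length combine.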

Polynucleotides of the invention may be used to produce a primer, e.g. a PCR
primer, a primer for an alternative amplification reaction, a probe e.g. labelled with a
revealing label by conventional means using radioactive or non-radioactive labels, or the
polynucleotides may be cloned into vectors. Such primers, probes and other fragments will
be at least 15, preferably at least 20, for example at least 25, 30 or 40 nucleotides in length,
and are also encompassed by the term polynucleotides of the invention as used herein. 

Polynucleotides such as a DNA polynucleotide and primers according to the 
invention may be produced recombinantly, synthetically, or by any means available to those 
of skill in the art. They may also be cloned by standard techniques. 

In general, primers will be produced by synthetic means, involving a stepwise
manufacture of the desired nucleic acid sequence one nucleotide at a time. Automated
techniques for accomplishing this are readily available in the art.

Longer polynucleotides will generally be produced using recombinant means, for
example using PCR (polymerase chain reaction) cloning techniques. This will involve
making a pair of primers (e.g. of about 15-30 nucleotides) to a region of the MNGEF gene
which it is desired to clone, bringing the primers into contact with mRNA or cDNA
obtained from an animal or human cell (e.g. a brain cell), performing a polymerase chain
reaction under conditions which bring about amplification of the desired region, isolating
the amplified segment (e.g. by purifying the reaction mixture on an agarose gel) and
recovering the amplified DNA. The primers may be designed to contain suitable restriction 
enzyme recognition sites so that the amplified DNA can be cloned into a suitable cloning 
vector. 
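
The primer design step described above can be sketched as follows. This is a hedged illustration only: the template is a hypothetical sequence, not part of any SEQ ID, the choice of EcoRI and BamHI sites is an assumption made for the example, and the two extra 5' bases (here called a clamp) are a common convention that the text does not mention.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Recognition sequences of two commonly used restriction enzymes.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(template, start, end, length=18,
                   fwd_site="EcoRI", rev_site="BamHI", clamp="GC"):
    """Build a primer pair flanking template[start..end] (1-based, inclusive).

    Each primer carries a restriction site and a short 5' clamp so that the
    amplified DNA can be cut and ligated into a cloning vector.
    """
    region = template[start - 1:end]
    if len(region) < length:
        raise ValueError("region is shorter than the primer length")
    forward = clamp + SITES[fwd_site] + region[:length]
    reverse = clamp + SITES[rev_site] + reverse_complement(region[-length:])
    return forward, reverse


# Hypothetical 60-nucleotide template used only for illustration.
TEMPLATE = "ACGTTGCAAGGCTTACCGATAGCTAGGCTTAACCGGTTACGATCGTAGCTAGGATTTACC"
fwd, rev = design_primers(TEMPLATE, 1, 60)
print(fwd)
print(rev)
```

A real design would also check that the chosen sites are absent from the amplified region and that the two primers have compatible melting temperatures.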

Such techniques may be used to obtain all or part of the MNGEF sequence described 
herein. Genomic clones containing the MNGEF gene and its introns and promoter regions
may also be obtained in an analogous manner, starting with genomic DNA from an animal
or human cell, e.g. a brain cell.

Although in general the techniques mentioned herein are well known in the art, 
reference may be made in particular to Sambrook et al., Molecular Cloning, A Laboratory
Manual (1989) and Ausubel et al., Current Protocols in Molecular Biology (1995), John
Wiley & Sons, Inc. 

Polynucleotides which are not 100% homologous to the sequences of the present 
invention but fall within the scope of the invention can be obtained in a number of ways. 
Other murine allelic variants of the MNGEF sequence described herein may be obtained for
example by probing genomic DNA libraries made from a range of individuals, for example
individuals from different populations. In addition, other animal, particularly mammalian
(e.g. rat or rabbit, more particularly primate), homologues of MNGEF may be obtained and
such homologues and fragments thereof in general will be capable of selectively hybridising
to SEQ ID No. 1. Such sequences may be obtained by probing cDNA libraries made from
dividing cells or tissues or genomic DNA libraries from other animal species, and probing
such libraries with probes comprising all or part of SEQ ID No. 1 under conditions of medium
to high stringency (for example 0.03M sodium chloride and 0.03M sodium citrate at from
about 50°C to about 60°C). Nucleic acid probes comprising all or part of SEQ ID No. 7
may be used to probe cDNA libraries from primate species, preferably humans, to obtain
homologues of MNGEF. In particular nucleic acid probes comprising all or part of SEQ ID
No. 7 may be used to probe cDNA libraries from humans, to obtain the full-length cDNA
encoding human NGEF or a homologue thereof.

Allelic variants and species homologues may also be obtained using degenerate PCR
which will use primers designed to target sequences within the variants and homologues
encoding conserved amino acid sequences. Conserved sequences can be predicted from
aligning the MNGEF amino acid sequence with that of TIM. The primers will contain one
or more degenerate positions and will be used at stringency conditions lower than those
used for cloning sequences with single sequence primers against known sequences. In
particular, primers can be designed to target the DH, PH and SH3 domains described above.
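
When choosing where in a conserved block to place a degenerate primer, it is common to prefer the stretch that can be encoded by the fewest distinct nucleotide sequences. The Python sketch below illustrates that calculation using the standard genetic code. The peptide shown is hypothetical and is not taken from MNGEF or TIM, and the function names are illustrative.

```python
from collections import Counter
from math import prod

# Amino acids of the standard genetic code, one letter per codon (64 entries).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_PER_AA = Counter(AMINO)   # e.g. M and W have 1 codon; L, S and R have 6


def degeneracy(peptide):
    """Number of distinct nucleotide sequences that can encode the peptide."""
    return prod(CODONS_PER_AA[aa] for aa in peptide)


def least_degenerate_window(peptide, size):
    """Return (degeneracy, offset, sub-peptide) for the least degenerate window."""
    windows = [
        (degeneracy(peptide[i:i + size]), i, peptide[i:i + size])
        for i in range(len(peptide) - size + 1)
    ]
    return min(windows)


# Hypothetical conserved block; a 3-residue window corresponds to a 9-nucleotide core.
print(least_degenerate_window("LSRMWKDE", 3))
```

Windows rich in methionine and tryptophan score lowest because each has a single codon, which is why such residues are favoured at primer sites.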

Alternatively, such polynucleotides may be obtained by site directed mutagenesis of 
the MNGEF sequences or allelic variants thereof. This may be useful where for example
silent codon changes are required to sequences to optimise codon preferences for a
particular host cell in which the polynucleotide sequences are being expressed. Other
sequence changes may be desired in order to introduce restriction enzyme recognition sites, 
or to alter the property or function of the polypeptides encoded by the polynucleotides. 
Further changes may be desirable to represent particular coding changes found in MNGEF 
which give rise to mutant MNGEF genes which have lost their regulatory function. Probes 
based on such changes can be used as diagnostic probes to detect such MNGEF mutants. 

The invention further provides double stranded polynucleotides comprising a 
polynucleotide of the invention and its complement. 

Polynucleotides or primers of the invention may carry a revealing label. Suitable 
labels include radioisotopes such as 32P or 35S, enzyme labels, or other protein labels such as
biotin. Such labels may be added to polynucleotides or primers of the invention and may be
detected using techniques known per se.

Polynucleotides or primers of the invention or fragments thereof, labelled or
unlabelled, may be used by a person skilled in the art in nucleic acid-based tests for detecting
or sequencing MNGEF and its homologues in the human or animal body.

Such tests for detecting generally comprise bringing a biological sample containing
DNA or RNA into contact with a probe comprising a polynucleotide or primer of the
invention under hybridising conditions and detecting any duplex formed between the probe
and nucleic acid in the sample. Such detection may be achieved using techniques such as
PCR or by immobilising the probe on a solid support, removing nucleic acid in the sample
which is not hybridised to the probe, and then detecting nucleic acid which has hybridised to
the probe. Alternatively, the sample nucleic acid may be immobilised on a solid support,
and the amount of probe bound to such a support can be detected. Suitable assay methods
of this and other formats can be found in for example WO89/03891 and WO90/13667.

Tests for sequencing MNGEF and its homologues include bringing a biological 
sample containing target DNA or RNA into contact with a probe comprising a
polynucleotide or primer of the invention under hybridising conditions and determining the
sequence by, for example, the Sanger dideoxy chain termination method (see Sambrook et
al.).

Such a method generally comprises elongating, in the presence of suitable reagents,
the primer by synthesis of a strand complementary to the target DNA or RNA and
selectively terminating the elongation reaction at one or more of an A, C, G or T/U residue;
allowing strand elongation and termination reaction to occur; separating out according to
size the elongated products to determine the sequence of the nucleotides at which selective
termination has occurred. Suitable reagents include a DNA polymerase enzyme, the
deoxynucleotides dATP, dCTP, dGTP and dTTP, a buffer and ATP. Dideoxynucleotides
are used for selective termination. 
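
The logic of chain termination sequencing can be shown with a small simulation. The sketch below is an idealised illustration, not a laboratory protocol: it assumes a DNA template written 5' to 3', a primer that anneals at the 3' end of the template, and exactly one terminated product per position. The template and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def termination_lengths(template, primer_len):
    """For each dideoxynucleotide, list the lengths of products ending in that base."""
    # The newly synthesised strand is the reverse complement of the template.
    new_strand = template.translate(COMPLEMENT)[::-1]
    lanes = {base: [] for base in "ACGT"}
    for pos in range(primer_len, len(new_strand)):
        # A product terminated at `pos` contains pos + 1 nucleotides in total.
        lanes[new_strand[pos]].append(pos + 1)
    return lanes


def read_sequence(lanes):
    """Read the sequence by ordering all terminated products by size."""
    by_length = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in by_length)


# Hypothetical 10-nucleotide template with a 4-nucleotide primer.
lanes = termination_lengths("AAGCTTGCAT", primer_len=4)
print(lanes)
print(read_sequence(lanes))
```

Sorting by length corresponds to the size separation step, and the base associated with each length is the residue at which termination occurred.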

Tests for detecting or sequencing MNGEF, or its homologue, in a biological sample 
may be used to determine MNGEF sequences within cells in individuals who have, or are 
suspected to have, an altered MNGEF gene sequence, for example within cancer cells 
including leukaemia cells and solid tumours such as breast, ovary, lung, colon, pancreas,
testes, liver, brain, muscle and bone tumours or within cells from the nervous system of
individuals suffering from neurological disorders.

In addition, the discovery of MNGEF will allow the role of this gene in hereditary 
diseases to be investigated. In general, this will involve establishing the status of MNGEF, 
or its homologue (e.g. using PCR sequence analysis), in cells derived from animals or
humans with, for example, neurological disorders or neoplasms.

The probes of the invention may conveniently be packaged in the form of a test kit 
in a suitable container. In such kits the probe may be bound to a solid support where the 
assay format for which the kit is designed requires such binding. The kit may also contain 
suitable reagents for treating the sample to be probed, hybridising the probe to nucleic acid
in the sample, control reagents, instructions, and the like. 

The present invention also provides polynucleotides encoding the polypeptides of
the invention described below. Because such polynucleotides will be useful as sequences
for recombinant production of polypeptides of the invention, it is not necessary for them to
be selectively hybridisable to the sequence of any one of SEQ ID Nos. 1, 3, 5 or 7 although
this will generally be desirable. Otherwise, such polynucleotides may be labelled, used, and 
made as described above if desired. Polypeptides of the invention are described below. 

B. Polypeptides.

Polypeptides of the invention include polypeptides in substantially isolated form
which comprise the sequence set out in SEQ ID Nos. 2, 4, 6 or 8. Polypeptides further
include variants of such sequences, including naturally occurring allelic variants and
synthetic variants which are substantially homologous to said polypeptides. In this context,
substantial homology is regarded as a sequence which has at least 70%, e.g. 80% or 90%
amino acid homology (identity) over 30 amino acids with the sequence of SEQ ID No. 2.

Polypeptides also include MNGEF homologues, and variants
thereof as defined above, from other species including animals such as mammals (e.g. mice,
rats or rabbits), especially primates, more especially humans. MNGEF homologues include 
human NGEF. 

Polypeptides of the invention also include fragments of the above mentioned full
length polypeptides and variants thereof, including fragments of the sequences set out in
SEQ ID Nos. 2, 4, 6 or 8. Preferred fragments include those which include an epitope.
Suitable fragments will be at least about 5, e.g. 10, 12, 15 or 20 amino acids in size.
Polypeptide fragments of the MNGEF and human NGEF proteins and allelic and species
variants thereof may contain one or more (e.g. 2, 3, 5, or 10) substitutions, deletions or
insertions, including conserved substitutions.

Conserved substitutions may be made according to the following table, which
indicates conservative substitutions, where amino acids in the same block in the second
column and preferably in the same line in the third column may be substituted for each
other:



ALIPHATIC     Non-polar            G A P
                                   I L V
              Polar - uncharged    C S T M
                                   N Q
              Polar - charged      D E
                                   K R
AROMATIC                           H F W Y
OTHER                              N Q D E
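
The table can be applied programmatically when screening candidate substitutions. The Python sketch below encodes the table as read from the text. Treating the AROMATIC and OTHER rows as single blocks of their own is an assumption, since the second column is empty for those rows, and the function name and return labels are hypothetical.

```python
# Blocks (second column) and lines (third column) of the substitution table.
BLOCKS = {
    "Non-polar": ["GAP", "ILV"],
    "Polar - uncharged": ["CSTM", "NQ"],
    "Polar - charged": ["DE", "KR"],
    "Aromatic": ["HFWY"],   # assumption: treated as its own block
    "Other": ["NQDE"],      # assumption: treated as its own block
}


def classify_substitution(a, b):
    """Return 'same line', 'same block' or None for a pair of one-letter codes."""
    if a == b:
        return "identical"
    # Preferred case first: both residues on the same line of the third column.
    for lines in BLOCKS.values():
        if any(a in line and b in line for line in lines):
            return "same line"
    # Otherwise accept residues that share a block of the second column.
    for lines in BLOCKS.values():
        members = "".join(lines)
        if a in members and b in members:
            return "same block"
    return None


for pair in (("I", "L"), ("G", "V"), ("D", "K"), ("N", "D"), ("G", "W")):
    print(pair, classify_substitution(*pair))
```

A result of None indicates that the table does not list the pair as a conservative substitution.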






WO 98/23743



PCT/GB97/03302 



Epitopes may be determined, for example, by techniques such as peptide scanning 
techniques as described by Geysen et al., 1986.

Polypeptides of the invention may be in a substantially isolated form. It will be 
understood that the polypeptide may be mixed with carriers or diluents which will not 
interfere with the intended purpose of the polypeptide and still be regarded as substantially
isolated. A polypeptide of the invention may also be in a substantially purified form, in
which case it will generally comprise the polypeptide in a preparation in which more than
90%, e.g. 95%, 98% or 99% of the polypeptide in the preparation is a polypeptide of the
invention. Polypeptides of the invention may be modified for example by the addition of
histidine residues to assist their purification or by the addition of a signal sequence to
promote their secretion from a cell. 

A polypeptide of the invention may be labelled with a revealing label. The revealing 
label may be any suitable label which allows the polypeptide to be detected. Suitable labels 
include radioisotopes, enzymes, antibodies, polynucleotides and linkers such as
biotin. Labelled polypeptides of the invention may be used in diagnostic procedures such as
immunoassays in order to determine the amount of a polypeptide of the invention in a
sample. Polypeptides or labelled polypeptides of the invention may also be used in
serological or cell mediated immune assays for the detection of immune reactivity to said 
polypeptides in animals and humans using standard protocols. 

A polypeptide or labelled polypeptide of the invention or fragment thereof may also
be fixed to a solid phase, for example the surface of an immunoassay well or dipstick. Such
labelled and/or immobilised polypeptides may be packaged into kits in a suitable container
along with suitable reagents, controls, instructions and the like. Such polypeptides and kits
may be used in methods of detection of antibodies to the MNGEF or human NGEF proteins
or their allelic or species variants by immunoassay.

Immunoassay methods are well known in the art and will generally comprise: 

(a) providing a polypeptide comprising an epitope bindable by an antibody 
against said protein; 

(b) incubating a biological sample with said polypeptide under conditions which
allow for the formation of an antibody-antigen complex; and

(c) determining whether antibody-antigen complex comprising said polypeptide
is formed.

Polypeptides of the invention may be made by synthetic means (e.g. as described by
Geysen et al., 1996) or recombinantly, as described below.

Particularly preferred polypeptides of the invention include those spanning or within 
the DH, PH or SH3 homology domains or sequences substantially homologous thereto.
Preferred polypeptides comprise regions showing substantial homology to the DH domain
of MNGEF represented as amino acids 124 to 306 of SEQ ID No. 2. Preferred polypeptides
will also comprise regions showing substantial homology to the PH domain of MNGEF
represented as amino acids 333 to 445 of SEQ ID No. 2. Preferred polypeptides will further
comprise regions showing substantial homology to the SH3 domain of MNGEF represented
as amino acids 456 to 517 of SEQ ID No. 2. Fragments as defined above from these regions
are particularly preferred. The polypeptides and fragments thereof may contain amino acid
alterations as defined above. 

Polypeptides of the invention may be used in in vitro or in vivo cell culture systems
to study the role of MNGEF, human NGEF and their homologues in disease. For example,
truncated or modified MNGEF may be introduced into a cell to disrupt the normal functions
which occur in the cell. The polypeptides of the invention may be introduced into the cell
by in situ expression of the polypeptide from a recombinant expression vector (see below).
The expression vector optionally carries an inducible promoter to control the expression of 
the polypeptide. 

The use of mammalian host cells is expected to provide for such post-translational 
modifications (e.g. myristoylation, glycosylation, truncation, lipidation and tyrosine, serine
or threonine phosphorylation) as may be needed to confer optimal biological activity on
recombinant expression products of the invention. Such cell culture systems in which
polypeptides of the invention are expressed may be used in assay systems to identify
candidate substances which interfere with or enhance the functions of the polypeptides of
the invention in the cell. 

C. Vectors.

Polynucleotides of the invention can be incorporated into a recombinant replicable 
vector. The vector may be used to replicate the nucleic acid in a compatible host cell. Thus 
in a further embodiment, the invention provides a method of making polynucleotides of the
invention by introducing a polynucleotide of the invention into a replicable vector,
introducing the vector into a compatible host cell, and growing the host cell under
conditions which bring about replication of the vector. The vector may be recovered from
the host cell. Suitable host cells are described below in connection with expression vectors.

Expression Vectors. 

Preferably, a polynucleotide of the invention in a vector is operably linked to a
regulatory sequence which is capable of providing for the expression of the coding sequence
by the host cell, i.e. the vector is an expression vector. The term "operably linked" refers to
a juxtaposition wherein the components described are in a relationship permitting them to
function in their intended manner. A regulatory sequence "operably linked" to a coding
sequence is ligated in such a way that expression of the coding sequence is achieved under
conditions compatible with the control sequences.

The term "regulatory sequences" includes promoters and enhancers and other
expression regulation signals. These may be selected to be compatible with the host cell for
which the expression vector is designed. For example, yeast regulatory sequences include
S. cerevisiae GAL4 and ADH promoters, S. pombe nmt1 and adh promoters. Mammalian
promoters, such as a-actin promoters, may be used. Mammalian promoters also include the
metallothionein promoter which can upregulate expression in response to heavy metals such
as cadmium and is thus an inducible promoter. Tissue-specific promoters, for example
neuronal cell specific promoters, may be used. Viral promoters may also be used, for example the
Moloney murine leukaemia virus long terminal repeat (MMLV LTR), the Rous
sarcoma virus (RSV) LTR promoter, the SV40 promoter, the human cytomegalovirus
(CMV) IE promoter, herpes simplex virus promoters or adenovirus promoters. All these
promoters are readily available in the art.

Such vectors may be transformed into a suitable host cell as described above to 
provide for expression of a polypeptide of the invention. Thus, in a further aspect the 
invention provides a process for preparing polypeptides according to the invention which
comprises cultivating a host cell transformed or transfected with an expression vector as
described above under conditions to provide for expression by the vector of a coding
sequence encoding the polypeptides, and recovering the expressed polypeptides.

The vectors may be for example, plasmid, virus or phage vectors provided with an 
origin of replication, optionally a promoter for the expression of the said polynucleotide and 
optionally a regulator of the promoter. The vectors may contain one or more selectable 
marker genes, for example an ampicillin resistance gene in the case of a bacterial plasmid or
a neomycin resistance gene for a mammalian vector. Vectors may be used in vitro, for
example for the production of RNA or used to transfect or transform a host cell. The vector
may also be adapted to be used in vivo, for example in a method of gene therapy.

A further embodiment of the invention provides host cells transformed or transfected
with the vectors for the replication and expression of polynucleotides of the invention. The
cells will be chosen to be compatible with the said vector and may for example be bacterial,
yeast, insect or mammalian.

Polynucleotides according to the invention may also be inserted into the vectors
described above in an antisense orientation in order to provide for the production of
antisense RNA. Antisense RNA or other antisense polynucleotides may also be produced
by synthetic means. Such antisense polynucleotides may be used in a method of controlling
the levels of MNGEF or its variants or species homologues.

D. Antibodies

The invention also provides monoclonal or polyclonal antibodies to polypeptides of
the invention or fragments thereof. The invention further provides a process for the
production of monoclonal or polyclonal antibodies to polypeptides of the invention.
Monoclonal antibodies may be prepared by conventional hybridoma technology using the
polypeptides of the invention or peptide fragments thereof as immunogens. Polyclonal
antibodies may also be prepared by conventional means which comprise inoculating a host
animal, for example a rat or a rabbit, with a polypeptide of the invention or peptide fragment
thereof and recovering immune serum. In order that such antibodies may be made, the
invention also provides polypeptides of the invention or fragments thereof haptenised to
another polypeptide for use as immunogens in animals or humans.

For the purposes of this invention, the term "antibody", unless specified to the
contrary, includes fragments of whole antibodies which retain their binding activity for a
tumour target antigen. Such fragments include Fv, F(ab') and F(ab')2 fragments, as well as
single chain antibodies. Furthermore, the antibodies and fragments thereof may be
humanised antibodies, e.g. as described in EP-A-239400.

Antibodies may be used in a method of detecting polypeptides of the invention
present in biological samples, which method comprises:

(a) providing an antibody of the invention;

(b) incubating a biological sample with said antibody under conditions which
allow for the formation of an antibody-antigen complex; and

(c) determining whether an antibody-antigen complex comprising said antibody is
formed.

Suitable samples include extracts from brain tissue, both normal and neoplastic.
Suitable samples may also include extracts from other tissues such as breast, ovary, lung,
colon, pancreas, testes, liver, muscle and bone tissues or from neoplastic growths derived
from such tissues.

Antibodies of the invention may be bound to a solid support and/or packaged into
kits in a suitable container along with suitable reagents, controls, instructions and the like.

E. Therapeutic uses

G-protein mediated signal transduction pathways have been shown to be involved in
the control of cell division and growth. Many of the gene products involved in such
pathways are proto-oncogenes, i.e. they are capable of causing cellular transformation if
mutated or aberrantly expressed, for example over-expressed. Therefore, mutations in
MNGEF or its homologues may be a cause of cellular transformation, especially in the case
of tumours associated with neuronal tissue, more particularly brain tissue. It may be
possible to treat tumours that arise as a result by restoring normal MNGEF/NGEF function.
This may be performed by means of gene therapy. Alternatively, it may be possible to raise
antibodies that specifically recognise mutated regions of the MNGEF protein, or its human
homologue, NGEF. Such antibodies could be linked to therapeutic agents which would
then specifically target cancer cells containing the mutated form of MNGEF/NGEF.

Thus the polypeptides, polynucleotides and antibodies of the invention may be used
as compounds for treating neoplasms in animals or humans. Typically the compounds
are formulated for clinical administration by mixing them with a pharmaceutically
acceptable carrier or diluent. For example they can be formulated for topical, parenteral,
intravenous, intramuscular, subcutaneous, intraocular or transdermal administration.
Preferably, the compound is used in an injectable form. Direct injection into the patient's
tumour is advantageous because it makes it possible to concentrate the therapeutic effect at
the level of the affected tissues. It may therefore be mixed with any vehicle which is
pharmaceutically acceptable for an injectable formulation, preferably for a direct injection at
the site to be treated. The pharmaceutically acceptable carrier or diluent may be, for example,
sterile or isotonic solutions.

The dose of compound used may be adjusted according to various parameters,
especially according to the compound used, the age, weight and condition of the patient to
be treated, the mode of administration used, the pathology of the tumour and the required
clinical regimen. As a guide, the amount of compound administered by injection is suitably
from 0.01 mg/kg to 30 mg/kg, preferably from 0.1 mg/kg to 10 mg/kg.
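By way of illustration only, the per-kilogram guide ranges above can be converted into total amounts for a given body weight. The following is a minimal sketch; the function name and the 70 kg body weight are assumptions introduced here for the example and do not appear in the specification.

```python
def dose_range_mg(weight_kg, low_mg_per_kg, high_mg_per_kg):
    """Return the (lowest, highest) total dose in mg for one body weight."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("low bound exceeds high bound")
    return weight_kg * low_mg_per_kg, weight_kg * high_mg_per_kg


# Per-kilogram guide ranges quoted in the text for administration by injection.
SUITABLE_MG_PER_KG = (0.01, 30.0)
PREFERRED_MG_PER_KG = (0.1, 10.0)

# Hypothetical 70 kg patient, used purely as an illustration.
WEIGHT_KG = 70.0

suitable = dose_range_mg(WEIGHT_KG, *SUITABLE_MG_PER_KG)
preferred = dose_range_mg(WEIGHT_KG, *PREFERRED_MG_PER_KG)

print(f"suitable:  {suitable[0]:.1f} to {suitable[1]:.1f} mg")
print(f"preferred: {preferred[0]:.1f} to {preferred[1]:.1f} mg")
```

For the assumed 70 kg body weight this corresponds to roughly 0.7 mg to 2100 mg over the broad range and 7 mg to 700 mg over the preferred range.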

The routes of administration and dosages described are intended only as a guide
since a skilled practitioner will be able to determine readily the optimum route of
administration and dosage for any particular patient and condition.

Compounds to be administered may include polypeptides, nucleic acids or
antibodies. The nucleic acids may encode polypeptides or they may encode antisense
constructs that inhibit expression of a cellular gene. Nucleic acids may be administered by,
for example, lipofection or by viral vectors. For example, the nucleic acid may form part of
a viral vector such as an adenovirus. When viral vectors are used, in general the dose
administered is between 10^4 and 10^14 pfu/ml, preferably 10^6 to 10^10 pfu/ml. The term pfu
("plaque forming unit") corresponds to the infectivity of a virus solution and is determined
by infecting an appropriate cell culture and measuring, generally after 48 hours, the number
of plaques of infected cells. The techniques for determining the pfu titre of a viral solution
are well documented in the literature.
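As a hedged illustration of how a pfu titre is commonly derived from a plaque count, the sketch below applies the usual relation titre = plaques / (dilution x volume plated). The function name and the plate values (42 plaques, a 10^-6 dilution, 0.1 ml plated) are hypothetical and are not taken from the specification.

```python
def pfu_per_ml(plaque_count, dilution, volume_ml):
    """Titre of the undiluted stock in plaque forming units per ml.

    plaque_count: plaques counted on one plate
    dilution:     dilution of the stock that was plated, e.g. 1e-6
    volume_ml:    volume of diluted virus applied to the cells, in ml
    """
    if plaque_count < 0 or dilution <= 0 or volume_ml <= 0:
        raise ValueError("count must be >= 0; dilution and volume must be > 0")
    return plaque_count / (dilution * volume_ml)


# Hypothetical plate: 42 plaques from 0.1 ml of a 10^-6 dilution.
titre = pfu_per_ml(42, 1e-6, 0.1)
print(f"{titre:.2e} pfu/ml")
```

In practice several dilutions would be plated and only plates with a countable number of plaques used; the sketch handles a single plate.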

Any cancer type may be treated by these methods, for example leukaemias, and
solid tumours such as breast, ovary, lung, colon, pancreas, testes, liver, brain, muscle and
bone tumours. Preferably, the tumour is a tumour of the nervous system, in particular the
central nervous system, for example the brain.

The observation that MNGEF is expressed predominantly in brain tissue and that
expression levels vary during foetal brain development (see Example 2) also suggests that
MNGEF plays a role in neurological function, in particular neurological development. Thus
it may be possible to diagnose, in particular prenatally, neurological conditions in which
MNGEF and its human homologues are implicated using the detection methods discussed
above. It may also be possible to treat such disorders by, in particular, gene therapy.

Mapping data indicate that MNGEF maps to mouse chromosome 1 within a region
syntenic to human chromosome 2q. NGEF maps to human chromosome 2 by hybridisation
to a panel of monochromosomal somatic cell hybrids. A form of the neurological disorder
dystonia also maps to the long arm of human chromosome 2. Thus, human NGEF may be
implicated in this disease. Therefore the above-mentioned probes and DNA sequences may
be used to detect and diagnose dystonia in humans by, for example, determining the
presence of mutant human NGEF sequences as described above. Alternatively, the gene
encoding human NGEF may lie in close proximity to the gene implicated in a form of
dystonia which maps to the long arm of human chromosome 2. Therefore the above-mentioned
probes and DNA sequences may be used to detect and diagnose dystonia in
humans by, for example, genetic linkage analysis using techniques well known in the art,
including analysis of restriction fragment length polymorphisms associated with the human
NGEF locus. Detection and diagnosis in both cases outlined above may be carried out
prenatally using foetal tissue, or extracts thereof, or postnatally. Detection and diagnosis may
also be carried out on germline tissue or extracts thereof.

The following examples illustrate the invention: 

EXAMPLE 1 - Isolation of MNGEF2 and overlapping clones

MNGEF2 and the overlapping clones were isolated from an adult mouse brain
cDNA library (λZAP, Stratagene) cloned into the EcoRI and XhoI sites of the vector
pBluescript KS.

Approximately 10^6 plaques were screened using an oligonucleotide designated
M3/6T7 Forward from the M3/6 gene (5'-GCAGGAAAGCTGGGCAGCT-3' - SEQ ID
No. 9). The probe was end-labelled with γ-32P dCTP (3000 Ci/mmol) using Promega
kinase. The MNGEF1, MNGEF2 and MNGEF3 cDNA clones were isolated from the host
bacteriophage using a standard in vivo excision protocol. The three inserts were released
from the vector by digestion with the restriction enzymes EcoRI and XhoI. The sizes of the
MNGEF1, MNGEF2 and MNGEF3 clones were approximately 2.3, 2.8 and 1.35 kb
respectively.

The clones were sequenced using a standard sequencing protocol from USB
(Amersham). The full-length cDNAs were digested using TaqI restriction enzyme and the
resulting fragments were subcloned into the ClaI site of the vector pBluescript KS to
facilitate sequencing. Full-length sequencing in one direction was obtained by carrying out
sequential walks using insert-specific oligonucleotides. Sequence analysis was done using
the GCG Wisconsin package version 8.
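TaqI (T^CGA) and ClaI (AT^CGAT) both leave a 5'-CG overhang, which is why TaqI fragments can be ligated into a ClaI-cut vector. The sketch below, offered only as an illustration, locates cut positions on one strand and lists the resulting fragment lengths. The 24-base test sequence and the helper names are hypothetical; they are not part of the MNGEF sequences.

```python
def cut_positions(seq, site, offset):
    """Top-strand cut positions for every (possibly overlapping) site.

    offset is the number of bases of the site that lie 5' of the cut,
    e.g. 1 for TaqI (T^CGA) and 2 for ClaI (AT^CGAT).
    """
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq, cuts):
    """Lengths of the linear fragments produced by cutting at the given positions."""
    bounds = [0] + sorted(cuts) + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical 24-base sequence containing three TaqI sites, the first of
# which lies inside a ClaI site (ATCGAT).
TOY = "GGATCGATTTCGAACCGGTCGAGG"

taqi_cuts = cut_positions(TOY, "TCGA", 1)
clai_cuts = cut_positions(TOY, "ATCGAT", 2)
taqi_fragments = fragment_lengths(TOY, taqi_cuts)

print(taqi_cuts, clai_cuts, taqi_fragments)
```

Every ClaI site contains a TaqI site and is cut at the same position, so the ClaI cut in the toy sequence coincides with the first TaqI cut.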

Results 

Approximately one million plaques from an adult mouse brain cDNA library were
screened with an oligonucleotide (M3/6T7 Forward) from the M3/6 cDNA sequence. Five
positive clones were identified, three of which appeared to be the same transcript of
varying length. Sequencing of these cDNA clones demonstrated that they showed
significant homology to TIM, a transforming gene whose sequence is related to regulators
of small GTP-binding proteins. 60% homology was observed at the nucleotide level
between MNGEF2 and TIM. The homology extended over the region known as the DH
domain, which plays an important role in mediating cellular transformation. Sequencing
also revealed that two of these cDNA clones (MNGEF1 and MNGEF2) contained the
following trinucleotide repeat: (AGG)8GAG(AGG)3 (SEQ ID No. 10). In addition it was
observed that the longest of these cDNAs, MNGEF2, contained an extra 92 bp sequence,
which was not present in MNGEF1 and MNGEF3, although the flanking sequence of the
region was identical. This 92 bp fragment comprises an unspliced intron which results in a
premature termination codon as shown in SEQ ID No. 3.
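The observations above can be checked against the sequence listing with a short script. The sketch below is illustrative only: it expands the repeat (AGG)8GAG(AGG)3, locates it in the glutamate-rich stretch of the coding sequence of SEQ ID No. 1, translates the MNGEF2-specific codons of SEQ ID No. 3 up to the premature stop, and compares the two cDNA lengths. The variable and function names are assumptions of this example, and the two sequence fragments were transcribed by hand from the listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate from the first base, stopping after the first stop codon (*)."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


# Trinucleotide repeat reported for MNGEF1 and MNGEF2 (SEQ ID No. 10).
REPEAT = "AGG" * 8 + "GAG" + "AGG" * 3

# Glutamate-rich stretch of the coding sequence, transcribed from SEQ ID No. 1
# (GCT GGA, eight GAG codons, GGA, four GAG codons, CTG).
GLU_RICH = "GCTGGA" + "GAG" * 8 + "GGA" + "GAG" * 4 + "CTG"

# MNGEF2-specific codons at the end of the reading frame in SEQ ID No. 3.
MNGEF2_TAIL = "GGTGAGCCCCGCACCAAGGGGACTCTGCATCTTGGCCAGCCATGA"

repeat_offset = GLU_RICH.find(REPEAT)
glu_rich_protein = translate(GLU_RICH)
tail_protein = translate(MNGEF2_TAIL)
insert_length = 2833 - 2741  # lengths given for SEQ ID No. 3 and SEQ ID No. 1

print(repeat_offset, glu_rich_protein, tail_protein, insert_length)
```

The repeat is found one base out of register with the reading frame, which is why the AGG repeat at the nucleotide level appears as a run of glutamate (GAG) codons in the translation.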

EXAMPLE 2 - Expression of MNGEF2 in mouse and human tissues

To determine the pattern of expression of MNGEF, the cDNA clone MNGEF2 was
hybridised to Northern blots of poly(A)+ RNA derived from a selection of adult mouse
tissues and human foetal brain tissues.

Northern analysis

RNA was extracted from mouse tissue and poly(A)+ RNA was prepared from
100 of total RNA using the Dynabeads mRNA purification kit (Dynal). Northern blots
were prepared according to Current Protocols in Molecular Biology, with each lane
containing 2 ng of poly(A)+ RNA. The human foetal brain Northern blot and the mouse
foetal developmental Northern blot were obtained from Clontech. The blots were
hybridised at 42°C in standard formamide buffer and washed to a stringency of 0.1×SSC,
0.1% SDS at 65°C. The blots were visualised by autoradiography after exposure for one or
two days at -70°C.

Results 

The MNGEF2 cDNA clone detected a transcript of approximately 3 kb
predominantly in mouse brain and a faint one of the same size in mouse eye. In addition, a
shorter transcript (approximately 2.2 kb) of lower intensity was seen in the brain. A faint,
slightly larger transcript (about 3.5 kb) was also observed in small intestine and liver.

Hybridisation of the MNGEF2 cDNA clone to a Northern blot of human brain
tissues (Clontech) detected a 3 kb transcript expressed predominantly in the caudate nucleus,
but also in the amygdala and the hippocampus. The same sized transcript, albeit much
fainter, was observed in all the remaining tissues.

A similar 3 kb transcript was seen when the MNGEF2 cDNA clone was used as a
probe on a whole mouse embryo developmental Northern (Clontech). The strongest signal
was observed on day 7 of embryonic development. Weaker signals of the same size were
seen on days 11, 15 and 17.

EXAMPLE 3 - Partial cloning of human NGEF 

To isolate the human homologue of MNGEF, primers m32bt7f and m32bt3f were
used to amplify cDNA from human foetal brain. The sequences of the primers used are
shown below:

3.2AT3F: 5'-CAAGAGAGGCTGGCAGAGGCAC-3' - SEQ ID No. 11

3.2AT7F: 5'-GGACCAAGTTTGTATCCTTCAC-3' - SEQ ID No. 12

3.2BT7F: 5'-GGACATCTGCTGCAGCTCACG-3' - SEQ ID No. 13

3.2BT3F: 5'-GGAGAGCTCTGCCTCAGATCTG-3' - SEQ ID No. 14
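For reference, basic properties of the four primers listed above (length, GC content and reverse complement) can be computed as in the following sketch. The primer sequences are those given in the text; the function and variable names are assumptions of this example, and no melting temperatures or PCR conditions are implied.

```python
# Primer sequences as listed in the text (5' to 3').
PRIMERS = {
    "3.2AT3F": "CAAGAGAGGCTGGCAGAGGCAC",
    "3.2AT7F": "GGACCAAGTTTGTATCCTTCAC",
    "3.2BT7F": "GGACATCTGCTGCAGCTCACG",
    "3.2BT3F": "GGAGAGCTCTGCCTCAGATCTG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence (0.0 to 1.0)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {100 * gc_fraction(seq):.1f}%, "
          f"reverse complement 5'-{reverse_complement(seq)}-3'")
```

The reverse complement is useful when checking where a primer anneals on the strand shown in the sequence listing.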

An 803 bp product was amplified and cloned into the pGEMT vector (Promega).

The clone HFB32 was sequenced and the sequence is shown as SEQ ID No. 7. The
translated protein sequence is shown as SEQ ID No. 8. A comparison between the mouse and
human nucleotide sequences indicates 87.8% homology. A comparison between the
protein sequences of the two species indicates 97% homology.
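Percentage figures of this kind are typically obtained by counting identical positions in a pairwise alignment. The sketch below shows one such calculation, assuming the two sequences have already been aligned to equal length and that columns containing a gap are skipped (one of several conventions). The two short sequences are hypothetical and are not the HFB32 or MNGEF sequences; the method actually used to obtain the figures above is not stated in the text.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical positions between two pre-aligned sequences.

    Columns in which either sequence has a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Hypothetical 15-base fragments differing at two positions.
toy_a = "ATGATCCATCCCATT"
toy_b = "ATGATCCACCCCATC"

identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity")
```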

A search of the Yeast Genome database with the DH region of MNGEF showed
homology to an open reading frame (ORF) from chromosome XII (figure 6). This ORF
corresponds to a yeast protein called ROM2, which is a GDP-GTP exchange protein for
Rho1p containing the DH domains and the pleckstrin domains. The RHO1 gene encodes
a homologue of the mammalian RhoA small GTP-binding protein in yeast. Rho1p is
localised at the growth site and required for bud formation. Disruption of ROM2 results
in a temperature-sensitive growth phenotype. These mutants offer an attractive system to
study activation of Rho.

SEQUENCE LISTING

(iii) NUMBER OF SEQUENCES: 8 
(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2741 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 343..2004

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GCGCTCTACA GCAGCGGCGG CGGCAGCTCC GGCTTGAGCC GCGCGCGCTG CGACCTCACT 60

CAGAGCCCGC GCATTGCCCC CGGCTGGGCC CTGGGCCCCG CGCGGCTCCC CACCAGCCCC 120

TGAGCCTACC CGGTCGCTGG TCCCCATGGA GCTGCTGGCT GCAGCCTTCA GCGCCGCCTG 180

CGCCGTGGAC CACGACAGCT CCACCTCGGA GAGCGACACG CGCGACTCGG CGGCGGGACA 240

CCTGCCGGGC AGCGAGTCAT CCTCGACCCC TGGAAATGGA ACCACACCCG AGGAGTGCCC 300

AGCCCTCACC GACAGCCCCA CCACTCTCAC GGAGCCCTGC AG ATG ATC CAT CCC 354
                                               Met Ile His Pro
                                               1

ATT CCC GCC GAC TCC TGG AGA AAC CTC ATT GAA CAA ATA GGG CTC CTG 402
Ile Pro Ala Asp Ser Trp Arg Asn Leu Ile Glu Gln Ile Gly Leu Leu
5                   10                  15                  20

TAT CAA GAG TAT AGA GAC AAA TCG ACT CTC CAA GAA ATT GAA ACA CGG 450
Tyr Gln Glu Tyr Arg Asp Lys Ser Thr Leu Gln Glu Ile Glu Thr Arg
                25                  30                  35

AGG CAG CAG GAT GCA GAA ATC CAA GGC AAC TCC GAT GGG TCC CAG GTT 498
Arg Gln Gln Asp Ala Glu Ile Gln Gly Asn Ser Asp Gly Ser Gln Val
            40                  45                  50

GGG GAA GAC GCT GGA GAG GAG GAG GAG GAG GAG GAG GAG GGA GAG GAG 546
Gly Glu Asp Ala Gly Glu Glu Glu Glu Glu Glu Glu Glu Gly Glu Glu
        55                  60                  65

GAG GAG CTG GCC AGC CCT CCT GAG AGG AGA GCT CTG CCT CAG ATC TGC 594
Glu Glu Leu Ala Ser Pro Pro Glu Arg Arg Ala Leu Pro Gln Ile Cys
    70                  75                  80

CTG CTC AGC AAC CCC CAC TCC AGG TTC AAC CTC TGG CAA GAC CTT CCT 642
Leu Leu Ser Asn Pro His Ser Arg Phe Asn Leu Trp Gln Asp Leu Pro
85                  90                  95                  100



GAG ATC CAG AGC AGT GGC GTG CTG GAC ATT CTC CAG CCG GAG GAG ATC 690
Glu Ile Gln Ser Ser Gly Val Leu Asp Ile Leu Gln Pro Glu Glu Ile
                105                 110                 115

AGG CTG CAG GAG GCC ATG TTT GAG TTG GTT ACC TCT GAG GCC TCC TAC 738
Arg Leu Gln Glu Ala Met Phe Glu Leu Val Thr Ser Glu Ala Ser Tyr
            120                 125                 130

TAT AAG AGC CTG AAC CTG CTG GTG TCG CAC TTC ATG GAG AAC GAG CGT 786
Tyr Lys Ser Leu Asn Leu Leu Val Ser His Phe Met Glu Asn Glu Arg
        135                 140                 145

CTG AAG AAG ATC CTG CAT CCA TCT GAG GCC CAC ATC CTC TTT TCC AAT 834
Leu Lys Lys Ile Leu His Pro Ser Glu Ala His Ile Leu Phe Ser Asn
    150                 155                 160

GTC CTG GAT GTC ATG GCT GTC AGT GAG CGG TTT TTG CTG GAG CTA GAG 882
Val Leu Asp Val Met Ala Val Ser Glu Arg Phe Leu Leu Glu Leu Glu
165                 170                 175                 180

CAC CGC ATG GAG GAG AAC ATT GTT ATC TCG GAT GTG TGC GAC ATC GTG 930
His Arg Met Glu Glu Asn Ile Val Ile Ser Asp Val Cys Asp Ile Val
                185                 190                 195

TAC CGT TAC GCA GCT GAT CAC TTC TCG GTC TAT ATC ACT TAC GTC AGT 978
Tyr Arg Tyr Ala Ala Asp His Phe Ser Val Tyr Ile Thr Tyr Val Ser
            200                 205                 210

AAC CAG ACC TAC CAG GAA AGG ACA TAC AAG CAG CTC CTA CAG GAG AAG 1026
Asn Gln Thr Tyr Gln Glu Arg Thr Tyr Lys Gln Leu Leu Gln Glu Lys
        215                 220                 225

GCC GCT TTC CGG GAA CTG ATC GCA CAG TTG GAG CTG GAC CCC AAA TGC 1074
Ala Ala Phe Arg Glu Leu Ile Ala Gln Leu Glu Leu Asp Pro Lys Cys
    230                 235                 240

AAG GGC CTG CCT TTC TCC TCC TTC CTC ATC TTG CCT TTC CAG AGG ATC 1122
Lys Gly Leu Pro Phe Ser Ser Phe Leu Ile Leu Pro Phe Gln Arg Ile
245                 250                 255                 260

ACG AGA CTC AAG CTG CTG GTC CAG AAT ATC CTG AAG AGA GTG GAG GAG 1170
Thr Arg Leu Lys Leu Leu Val Gln Asn Ile Leu Lys Arg Val Glu Glu
                265                 270                 275

AGG TCT GAA CGT GAA GGC ACC GCC TTG GAT GCC CAC AAG GAG CTA GAA 1218
Arg Ser Glu Arg Glu Gly Thr Ala Leu Asp Ala His Lys Glu Leu Glu
            280                 285                 290



ATG GTG GTA AAG GCA TGC AAT GAG GGT GTC CGG AAG ATG AGC CGC ACA 1266
Met Val Val Lys Ala Cys Asn Glu Gly Val Arg Lys Met Ser Arg Thr
        295                 300                 305

GAA CAG ATG ATC AGC ATT CAG AAG AAG ATG GAG TTC AAG ATC AAG TCG 1314
Glu Gln Met Ile Ser Ile Gln Lys Lys Met Glu Phe Lys Ile Lys Ser
    310                 315                 320

GTA CCC ATC ATC TCA CAC TCC CGG TGG CTG CTG AAG CAG GGT GAG CTG 1362
Val Pro Ile Ile Ser His Ser Arg Trp Leu Leu Lys Gln Gly Glu Leu
325                 330                 335                 340

CAG CAG ATG TCC GGC CCC AAG ACC TCC CGC ACC CTG CGG ACC AAG AAG 1410
Gln Gln Met Ser Gly Pro Lys Thr Ser Arg Thr Leu Arg Thr Lys Lys
                345                 350                 355

CTC TTC AGA GAA ATT TAC CTC TTC CTC TTC AAT GAC CTG CTG GTG ATC 1458
Leu Phe Arg Glu Ile Tyr Leu Phe Leu Phe Asn Asp Leu Leu Val Ile
            360                 365                 370

TGC CGG CAG ATC CCT GGA GAC AAG TAC CAG GTG TTT GAT TCG GCC CCA 1506
Cys Arg Gln Ile Pro Gly Asp Lys Tyr Gln Val Phe Asp Ser Ala Pro
        375                 380                 385

AGG GGC CTG CTT CGA GTG GAG GAG CTG GAG GAC CAG GGT CAA ACA CTG 1554
Arg Gly Leu Leu Arg Val Glu Glu Leu Glu Asp Gln Gly Gln Thr Leu
    390                 395                 400

GCT AAT GTG TTC ATC CTG CGG CTG CTG GAA AAT GCA GAT GAC CGA GAG 1602
Ala Asn Val Phe Ile Leu Arg Leu Leu Glu Asn Ala Asp Asp Arg Glu
405                 410                 415                 420

GCC ACC TAT ATG CTG AAG GCA TCC TCC CAG AGC GAG ATG AAG CGC TGG 1650
Ala Thr Tyr Met Leu Lys Ala Ser Ser Gln Ser Glu Met Lys Arg Trp
                425                 430                 435

ATG ACC TCA CTG GCC CCC AAC AGG AGG ACC AAG TTT GTA TCC TTC ACA 1698
Met Thr Ser Leu Ala Pro Asn Arg Arg Thr Lys Phe Val Ser Phe Thr
            440                 445                 450

TCT CGG CTG TTG GAC TGT CCC CAG GTC CAG TGT GTG CAC CCG TAT GTG 1746
Ser Arg Leu Leu Asp Cys Pro Gln Val Gln Cys Val His Pro Tyr Val
        455                 460                 465

GCC CAG CAG CCT GAT GAA CTG ACG CTG GAA CTG GCA GAT ATC CTG AAC 1794
Ala Gln Gln Pro Asp Glu Leu Thr Leu Glu Leu Ala Asp Ile Leu Asn
    470                 475                 480

ATC CTG GAG AAG ACA GAG GAT GGG TGG ATC TTT GGT GAG CGG CTG CAT 1842
Ile Leu Glu Lys Thr Glu Asp Gly Trp Ile Phe Gly Glu Arg Leu His
485                 490                 495                 500

GAC CAG GAG AGA GGC TGG TTC CCC AGT TCC ATG ACA GAG GAG ATC CTG 1890
Asp Gln Glu Arg Gly Trp Phe Pro Ser Ser Met Thr Glu Glu Ile Leu
                505                 510                 515

AAC CCC AAG ATC CGC TCC CAG AAC CTC AAG GAA TGT TTC CGG GTA CAT 1938
Asn Pro Lys Ile Arg Ser Gln Asn Leu Lys Glu Cys Phe Arg Val His
            520                 525                 530

AAG ATG GAA GAC CCT CAG CGC AGC CAG AAT AAG GAC CGC AGG AAG CTG 1986
Lys Met Glu Asp Pro Gln Arg Ser Gln Asn Lys Asp Arg Arg Lys Leu
        535                 540                 545

GGC AGC CGG AAT CGT CAA TGAACCTCCC CAGCTCAGGC ACCTGAAGGG 2034
Gly Ser Arg Asn Arg Gln
    550



AAGGGTGTGG GCAGGGATGG GGAGCAGGCC CGGCAGAGAC GCCCGACAGA TTCAGAGGGC 2094

CTTAGGGAAG AATGTCAGTG CCTTCTCAGG CAGCAGGAGT GGCTTCGGCC TGCTCTGTCC 2154

CTGCCCATGC TGTGGAAGCT CTAGTGTCCT GGCCACTTGT TTGCTTGCAC ACTGGTGAAA 2214

AGCTAAGTAC TTAGGCAGTA TTACACCACC TCCCTTCAGT CTCTCAGAGG TAGAAGAAGG 2274

CAGGCATGCT CCAGAGACCT TCCGGTGACT GGAAGAGGCC CACACAAGGG TCCCTGGCAG 2334

CAGGCAGGTG GAAGGTAACC ACTGTCAGGA TCCCCTGAAC TGCACGTGTC CTTCCCTACT 2394

TTGGAAGCTG TTAAGAGTCT ACCAGGCACA CAGATGGCCG CCCCTGCCCG AGGGAGTTTG 2454

ATGAGCAGTG GTGACCCTGC CTGCCCGTCC CCGTGCCTCT GCCAGCCTCT CTTGCACGCC 2514

AAGCCCTGCC CTCAGCAGGC TTCCCAAAGC TTAGCTGAGG GTTCATGCCA CCTCTAGCTC 2574

CTTGAAGGGC TTGATATCAC TTGTGTCTCC TGGGCCCCTG ATGGAGCCCA GGCGTTTTGC 2634

AGAATGAATT GGTCACTGCA TCCTTTATGG TCATGGTTTT GAGAAAAGCA AATATCATTT 2694

TTGGCTGCAT TAAAAGAAGC ATCCTATATA AAAAAAAAAA AAAAAAA 2741



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 554 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ile His Pro Ile Pro Ala Asp Ser Trp Arg Asn Leu Ile Glu Gln
1               5                   10                  15

Ile Gly Leu Leu Tyr Gln Glu Tyr Arg Asp Lys Ser Thr Leu Gln Glu
            20                  25                  30

Ile Glu Thr Arg Arg Gln Gln Asp Ala Glu Ile Gln Gly Asn Ser Asp
        35                  40                  45

Gly Ser Gln Val Gly Glu Asp Ala Gly Glu Glu Glu Glu Glu Glu Glu
    50                  55                  60

Glu Gly Glu Glu Glu Glu Leu Ala Ser Pro Pro Glu Arg Arg Ala Leu
65                  70                  75                  80

Pro Gln Ile Cys Leu Leu Ser Asn Pro His Ser Arg Phe Asn Leu Trp
                85                  90                  95

Gln Asp Leu Pro Glu Ile Gln Ser Ser Gly Val Leu Asp Ile Leu Gln
            100                 105                 110

Pro Glu Glu Ile Arg Leu Gln Glu Ala Met Phe Glu Leu Val Thr Ser
        115                 120                 125

Glu Ala Ser Tyr Tyr Lys Ser Leu Asn Leu Leu Val Ser His Phe Met
    130                 135                 140

Glu Asn Glu Arg Leu Lys Lys Ile Leu His Pro Ser Glu Ala His Ile
145                 150                 155                 160

Leu Phe Ser Asn Val Leu Asp Val Met Ala Val Ser Glu Arg Phe Leu
                165                 170                 175

Leu Glu Leu Glu His Arg Met Glu Glu Asn Ile Val Ile Ser Asp Val
            180                 185                 190



WO 98/23743    PCT/GB97/03302



Cys Asp Ile Val Tyr Arg Tyr Ala Ala Asp His Phe Ser Val Tyr Ile
        195                 200                 205

Thr Tyr Val Ser Asn Gln Thr Tyr Gln Glu Arg Thr Tyr Lys Gln Leu
    210                 215                 220

Leu Gln Glu Lys Ala Ala Phe Arg Glu Leu Ile Ala Gln Leu Glu Leu
225                 230                 235                 240

Asp Pro Lys Cys Lys Gly Leu Pro Phe Ser Ser Phe Leu Ile Leu Pro
                245                 250                 255

Phe Gln Arg Ile Thr Arg Leu Lys Leu Leu Val Gln Asn Ile Leu Lys
            260                 265                 270

Arg Val Glu Glu Arg Ser Glu Arg Glu Gly Thr Ala Leu Asp Ala His
        275                 280                 285

Lys Glu Leu Glu Met Val Val Lys Ala Cys Asn Glu Gly Val Arg Lys
    290                 295                 300

Met Ser Arg Thr Glu Gln Met Ile Ser Ile Gln Lys Lys Met Glu Phe
305                 310                 315                 320

Lys Ile Lys Ser Val Pro Ile Ile Ser His Ser Arg Trp Leu Leu Lys
                325                 330                 335

Gln Gly Glu Leu Gln Gln Met Ser Gly Pro Lys Thr Ser Arg Thr Leu
            340                 345                 350

Arg Thr Lys Lys Leu Phe Arg Glu Ile Tyr Leu Phe Leu Phe Asn Asp
        355                 360                 365

Leu Leu Val Ile Cys Arg Gln Ile Pro Gly Asp Lys Tyr Gln Val Phe
    370                 375                 380

Asp Ser Ala Pro Arg Gly Leu Leu Arg Val Glu Glu Leu Glu Asp Gln
385                 390                 395                 400

Gly Gln Thr Leu Ala Asn Val Phe Ile Leu Arg Leu Leu Glu Asn Ala
                405                 410                 415

Asp Asp Arg Glu Ala Thr Tyr Met Leu Lys Ala Ser Ser Gln Ser Glu
            420                 425                 430

Met Lys Arg Trp Met Thr Ser Leu Ala Pro Asn Arg Arg Thr Lys Phe
        435                 440                 445

Val Ser Phe Thr Ser Arg Leu Leu Asp Cys Pro Gln Val Gln Cys Val
    450                 455                 460

His Pro Tyr Val Ala Gln Gln Pro Asp Glu Leu Thr Leu Glu Leu Ala
465                 470                 475                 480

Asp Ile Leu Asn Ile Leu Glu Lys Thr Glu Asp Gly Trp Ile Phe Gly
                485                 490                 495

Glu Arg Leu His Asp Gln Glu Arg Gly Trp Phe Pro Ser Ser Met Thr
            500                 505                 510

Glu Glu Ile Leu Asn Pro Lys Ile Arg Ser Gln Asn Leu Lys Glu Cys
        515                 520                 525

Phe Arg Val His Lys Met Glu Asp Pro Gln Arg Ser Gln Asn Lys Asp
    530                 535                 540

Arg Arg Lys Leu Gly Ser Arg Asn Arg Gln
545                 550



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2833 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 343..1860

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GCGCTCTACA GCAGCGGCGG CGGCAGCTCC GGCTTGAGCC GCGCGCGCTG CGACCTCACT 60

CAGAGCCCGC GCATTGCCCC CGGCTGGGCC CTGGGCCCCG CGCGGCTCCC CACCAGCCCC 120

TGAGCCTACC CGGTCGCTGG TCCCCATGGA GCTGCTGGCT GCAGCCTTCA GCGCCGCCTG 180

CGCCGTGGAC CACGACAGCT CCACCTCGGA GAGCGACACG CGCGACTCGG CGGCGGGACA 240

CCTGCCGGGC AGCGAGTCAT CCTCCACCCC TGGAAATGGA ACCACACCCG AGGAGTGCCC 300

AGCCCTCACC GACAGCCCCA CCACTCTCAC GGAGCCCTGC AG ATG ATC CAT CCC 354
                                               Met Ile His Pro
                                               1

ATT CCC GCC GAC TCC TGG AGA AAC CTC ATT GAA CAA ATA GGG CTC CTG 402
Ile Pro Ala Asp Ser Trp Arg Asn Leu Ile Glu Gln Ile Gly Leu Leu
5                   10                  15                  20

TAT CAA GAG TAT AGA GAC AAA TCG ACT CTC CAA GAA ATT GAA ACA CGG 450
Tyr Gln Glu Tyr Arg Asp Lys Ser Thr Leu Gln Glu Ile Glu Thr Arg
                25                  30                  35

AGG CAG CAG GAT GCA GAA ATC CAA GGC AAC TCC GAT GGG TCC CAG GTT 498
Arg Gln Gln Asp Ala Glu Ile Gln Gly Asn Ser Asp Gly Ser Gln Val
            40                  45                  50

GGG GAA GAC GCT GGA GAG GAG GAG GAG GAG GAG GAG GAG GGA GAG GAG 546
Gly Glu Asp Ala Gly Glu Glu Glu Glu Glu Glu Glu Glu Gly Glu Glu
        55                  60                  65

GAG GAG CTG GCC AGC CCT CCT GAG AGG AGA GCT CTG CCT CAG ATC TGC 594
Glu Glu Leu Ala Ser Pro Pro Glu Arg Arg Ala Leu Pro Gln Ile Cys
    70                  75                  80

CTG CTC AGC AAC CCC CAC TCC AGG TTC AAC CTC TGG CAA GAC CTT CCT 642
Leu Leu Ser Asn Pro His Ser Arg Phe Asn Leu Trp Gln Asp Leu Pro
85                  90                  95                  100

GAG ATC CAG AGC AGT GGC GTG CTG GAC ATT CTC CAG CCG GAG GAG ATC 690
Glu Ile Gln Ser Ser Gly Val Leu Asp Ile Leu Gln Pro Glu Glu Ile
                105                 110                 115

AGG CTG CAG GAG GCC ATG TTT GAG TTG GTT ACC TCT GAG GCC TCC TAC 738
Arg Leu Gln Glu Ala Met Phe Glu Leu Val Thr Ser Glu Ala Ser Tyr
            120                 125                 130

TAT AAG AGC CTG AAC CTG CTG GTG TCG CAC TTC ATG GAG AAC GAG CGT 786
Tyr Lys Ser Leu Asn Leu Leu Val Ser His Phe Met Glu Asn Glu Arg
        135                 140                 145

CTG AAG AAG ATC CTG CAT CCA TCT GAG GCC CAC ATC CTC TTT TCC AAT 834
Leu Lys Lys Ile Leu His Pro Ser Glu Ala His Ile Leu Phe Ser Asn
    150                 155                 160

GTC CTG GAT GTC ATG GCT GTC AGT GAG CGG TTT TTG CTG GAG CTA GAG 882
Val Leu Asp Val Met Ala Val Ser Glu Arg Phe Leu Leu Glu Leu Glu
165                 170                 175                 180

CAC CGC ATG GAG GAG AAC ATT GTT ATC TCG GAT GTG TGC GAC ATC GTG 930
His Arg Met Glu Glu Asn Ile Val Ile Ser Asp Val Cys Asp Ile Val
                185                 190                 195






TAC CGT TAC GCA GCT GAT CAC TTC TCG GTC TAT ATC ACT TAC GTC AGT 978
Tyr Arg Tyr Ala Ala Asp His Phe Ser Val Tyr Ile Thr Tyr Val Ser
            200                 205                 210

AAC CAG ACC TAC CAG GAA AGG ACA TAC AAG CAG CTC CTA CAG GAG AAG 1026
Asn Gln Thr Tyr Gln Glu Arg Thr Tyr Lys Gln Leu Leu Gln Glu Lys
        215                 220                 225

GCC GCT TTC CGG GAA CTG ATC GCA CAG TTG GAG CTG GAC CCC AAA TGC 1074
Ala Ala Phe Arg Glu Leu Ile Ala Gln Leu Glu Leu Asp Pro Lys Cys
    230                 235                 240

AAG GGC CTG CCT TTC TCC TCC TTC CTC ATC TTG CCT TTC CAG AGG ATC 1122
Lys Gly Leu Pro Phe Ser Ser Phe Leu Ile Leu Pro Phe Gln Arg Ile
245                 250                 255                 260

ACG AGA CTC AAG CTG CTG GTC CAG AAT ATC CTG AAG AGA GTG GAG GAG 1170
Thr Arg Leu Lys Leu Leu Val Gln Asn Ile Leu Lys Arg Val Glu Glu
                265                 270                 275

AGG TCT GAA CGT GAA GGC ACC GCC TTG GAT GCC CAC AAG GAG CTA GAA 1218
Arg Ser Glu Arg Glu Gly Thr Ala Leu Asp Ala His Lys Glu Leu Glu
            280                 285                 290

ATG GTG GTA AAG GCA TGC AAT GAG GGT GTC CGG AAG ATG AGC CGC ACA 1266
Met Val Val Lys Ala Cys Asn Glu Gly Val Arg Lys Met Ser Arg Thr
        295                 300                 305

GAA CAG ATG ATC AGC ATT CAG AAG AAG ATG GAG TTC AAG ATC AAG TCG 1314
Glu Gln Met Ile Ser Ile Gln Lys Lys Met Glu Phe Lys Ile Lys Ser
    310                 315                 320

GTA CCC ATC ATC TCA CAC TCC CGG TGG CTG CTG AAG CAG GGT GAG CTG 1362
Val Pro Ile Ile Ser His Ser Arg Trp Leu Leu Lys Gln Gly Glu Leu
325                 330                 335                 340

CAG CAG ATG TCC GGC CCC AAG ACC TCC CGC ACC CTG CGG ACC AAG AAG 1410
Gln Gln Met Ser Gly Pro Lys Thr Ser Arg Thr Leu Arg Thr Lys Lys
                345                 350                 355

CTC TTC AGA GAA ATT TAC CTC TTC CTC TTC AAT GAC CTG CTG GTG ATC 1458
Leu Phe Arg Glu Ile Tyr Leu Phe Leu Phe Asn Asp Leu Leu Val Ile
            360                 365                 370

TGC CGG CAG ATC CCT GGA GAC AAG TAC CAG GTG TTT GAT TCG GCC CCA 1506
Cys Arg Gln Ile Pro Gly Asp Lys Tyr Gln Val Phe Asp Ser Ala Pro
        375                 380                 385

AGG GGC CTG CTT CGA GTG GAG GAG CTG GAG GAC CAG GGT CAA ACA CTG 1554
Arg Gly Leu Leu Arg Val Glu Glu Leu Glu Asp Gln Gly Gln Thr Leu
    390                 395                 400

GCT AAT GTG TTC ATC CTG CGG CTG CTG GAA AAT GCA GAT GAC CGA GAG 1602
Ala Asn Val Phe Ile Leu Arg Leu Leu Glu Asn Ala Asp Asp Arg Glu
405                 410                 415                 420

GCC ACC TAT ATG CTG AAG GCA TCC TCC CAG AGC GAG ATG AAG CGC TGG 1650
Ala Thr Tyr Met Leu Lys Ala Ser Ser Gln Ser Glu Met Lys Arg Trp
                425                 430                 435

ATG ACC TCA CTG GCC CCC AAC AGG AGG ACC AAG TTT GTA TCC TTC ACA 1698
Met Thr Ser Leu Ala Pro Asn Arg Arg Thr Lys Phe Val Ser Phe Thr
            440                 445                 450

TCT CGG CTG TTG GAC TGT CCC CAG GTC CAG TGT GTG CAC CCG TAT GTG 1746
Ser Arg Leu Leu Asp Cys Pro Gln Val Gln Cys Val His Pro Tyr Val
        455                 460                 465

GCC CAG CAG CCT GAT GAA CTG ACG CTG GAA CTG GCA GAT ATC CTG AAC 1794
Ala Gln Gln Pro Asp Glu Leu Thr Leu Glu Leu Ala Asp Ile Leu Asn
    470                 475                 480

ATC CTG GAG AAG ACA GAG GAT GGT GAG CCC CGC ACC AAG GGG ACT CTG 1842
Ile Leu Glu Lys Thr Glu Asp Gly Glu Pro Arg Thr Lys Gly Thr Leu
485                 490                 495                 500

CAT CTT GGC CAG CCA TGA GAGAGAGGAC TATGGCCTAG ATGTAGGACT 1890
His Leu Gly Gln Pro *
                505

AGATGGT6CA GHAGCAGGG TGGATCTITG GTGAGCGGCT GCATGACCAG 6A6AGAGGCT 1950 

GGTOCCCAG TOCATGACA 6AGGAGATCC TGAACCCCAA GATCCGCTCC CAGAACCTCA 2010 

AGGAATGTTT CCGGGTACAT A/^TGGAAG ACCCTCAGCG CAGCCAGAAT AAQ3ACC6CA 2070 

6GAAGCTGGG CAGCCGGAAT CGTCAATGAA CCTCCCCAGC TCAGGCACCT GAAGGGAAGG 2130 

GTGTGGGCAG GGATGGGGAG CAGGCCCGGC AGAGACGCCC GACAGAHCA GAGQGCCTTA 2190 

GGGAAGAATG TCAGTGCCH CTCAGGCAGC AGGAGTGGCT TCGGCCTGCT CTGTCCCTGC 2250 

CCATGCTGTG GAAGCTCTAG TGTCCTGGCC ACnGTITGC HGCACACTG GTGAAAAGCT 2310 

AAGTACHAG GCAGTATTAC ACCACCTCCC TTCAGTCTCT CAGAGGTAGA AGAAQGCAGG 2370 

CATGCTCCAG AGACCHCCG GTGACTGGAA GAGGCCCACA CAAGG6TCCC TGGCAGCAGG 2430 
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CAGGTGGMG GTAACCACTG TCAGGATCCC CTGAACTGCA CGTGTCCTTG CCTACTTTGG 2490. 

AAGCTGHAA GAGTCTACCA GGCACACAGA TGGCC6CCCC TGCCCGAGGG AGHTGATGA 2550 

GCAGTGGTGA CCCTGCCTGC CCGTCCCCGT GCCTCTGCCA 6CCTCTCTTG CACGCCAA6C 2610 

- .CCTSCCCTCA GCAGGCTTCC CA'MtG(nT>AfefC5SAGG£Tt€ ATGCC^^ "WSCTSCHG ■> 267S 

AAGGGCTTGA TATCACTTGT GTCTCGTGGG CCCCTGATGG AGCCCAGGCG TTTTGCAGAA 2730

TGAATTGGTC ACTGCATCCT TTATGGTCAT GGTTTTGAGA AAAGCAAATA TCATTTTTGG 2790

CTGCATTAAA AGAAGCATCC TATATAAAAA AAAAAAAAAA AAA 2833



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 505 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Ile His Pro Ile Pro Ala Asp Ser Trp Arg Asn Leu Ile Glu Gln
1 5 10 15

Ile Gly Leu Leu Tyr Gln Glu Tyr Arg Asp Lys Ser Thr Leu Gln Glu
20 25 30

Ile Glu Thr Arg Arg Gln Gln Asp Ala Glu Ile Gln Gly Asn Ser Asp
35 40 45

Gly Ser Gln Val Gly Glu Asp Ala Gly Glu Glu Glu Glu Glu Glu Glu
50 55 60

Glu Gly Glu Glu Glu Glu Leu Ala Ser Pro Pro Glu Arg Arg Ala Leu
65 70 75 80

Pro Gln Ile Cys Leu Leu Ser Asn Pro His Ser Arg Phe Asn Leu Trp
85 90 95

Gln Asp Leu Pro Glu Ile Gln Ser Ser Gly Val Leu Asp Ile Leu Gln
100 105 110

Pro Glu Glu Ile Arg Leu Gln Glu Ala Met Phe Glu Leu Val Thr Ser
115 120 125

Glu Ala Ser Tyr Tyr Lys Ser Leu Asn Leu Leu Val Ser His Phe Met
130 135 140

Glu Asn Glu Arg Leu Lys Lys Ile Leu His Pro Ser Glu Ala His Ile
145 150 155 160

Leu Phe Ser Asn Val Leu Asp Val Met Ala Val Ser Glu Arg Phe Leu
165 170 175

Leu Glu Leu Glu His Arg Met Glu Glu Asn Ile Val Ile Ser Asp Val
180 185 190

Cys Asp Ile Val Tyr Arg Tyr Ala Ala Asp His Phe Ser Val Tyr Ile
195 200 205

Thr Tyr Val Ser Asn Gln Thr Tyr Gln Glu Arg Thr Tyr Lys Gln Leu
210 215 220

Leu Gln Glu Lys Ala Ala Phe Arg Glu Leu Ile Ala Gln Leu Glu Leu
225 230 235 240

Asp Pro Lys Cys Lys Gly Leu Pro Phe Ser Ser Phe Leu Ile Leu Pro
245 250 255

Phe Gln Arg Ile Thr Arg Leu Lys Leu Leu Val Gln Asn Ile Leu Lys
260 265 270

Arg Val Glu Glu Arg Ser Glu Arg Glu Gly Thr Ala Leu Asp Ala His
275 280 285

Lys Glu Leu Glu Met Val Val Lys Ala Cys Asn Glu Gly Val Arg Lys
290 295 300

Met Ser Arg Thr Glu Gln Met Ile Ser Ile Gln Lys Lys Met Glu Phe
305 310 315 320

Lys Ile Lys Ser Val Pro Ile Ile Ser His Ser Arg Trp Leu Leu Lys
325 330 335

Gln Gly Glu Leu Gln Gln Met Ser Gly Pro Lys Thr Ser Arg Thr Leu
340 345 350

Arg Thr Lys Lys Leu Phe Arg Glu Ile Tyr Leu Phe Leu Phe Asn Asp
355 360 365

Leu Leu Val Ile Cys Arg Gln Ile Pro Gly Asp Lys Tyr Gln Val Phe
370 375 380

Asp Ser Ala Pro Arg Gly Leu Leu Arg Val Glu Glu Leu Glu Asp Gln
385 390 395 400

Gly Gln Thr Leu Ala Asn Val Phe Ile Leu Arg Leu Leu Glu Asn Ala
405 410 415

Asp Asp Arg Glu Ala Thr Tyr Met Leu Lys Ala Ser Ser Gln Ser Glu
420 425 430

Met Lys Arg Trp Met Thr Ser Leu Ala Pro Asn Arg Arg Thr Lys Phe
435 440 445

Val Ser Phe Thr Ser Arg Leu Leu Asp Cys Pro Gln Val Gln Cys Val
450 455 460

His Pro Tyr Val Ala Gln Gln Pro Asp Glu Leu Thr Leu Glu Leu Ala
465 470 475 480

Asp Ile Leu Asn Ile Leu Glu Lys Thr Glu Asp Gly Glu Pro Arg Thr
485 490 495

Lys Gly Thr Leu His Leu Gly Gln Pro *
500 505



(2) INFORMATION FOR SEQ ID NO: 5: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2343 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 2..1609

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

C CTG TAT CAA GAG TAT AGA GAC AAA TCG ACT CTC CAA GAA ATT GAA 46
Leu Tyr Gln Glu Tyr Arg Asp Lys Ser Thr Leu Gln Glu Ile Glu
1 5 10 15

ACA AGG CAG CAG GAT GCA GAA ATC CAA GGC AAC TCC GAT GGG TCC 94 
Thr Arg Arg Gln Gln Asp Ala Glu Ile Gln Gly Asn Ser Asp Gly Ser
20 25 30

CAG GTT GGG GAA GAC GCT GGA GAG GAG GAG GAG GAG GAG GAG GAG GGA 142
Gln Val Gly Glu Asp Ala Gly Glu Glu Glu Glu Glu Glu Glu Glu Gly
35 40 45

GAG GAG GAG GAG CTG GCC AGC CCT CCT GAG AGG AGA GCT CTG CCT CAG 190
Glu Glu Glu Glu Leu Ala Ser Pro Pro Glu Arg Arg Ala Leu Pro Gln
50 55 60

ATC TGC CTG CTC AGC AAC CCC CAC TCC AGG TTC AAC CTC TGG CAA GAC 238
Ile Cys Leu Leu Ser Asn Pro His Ser Arg Phe Asn Leu Trp Gln Asp
65 70 75

CTT CCT GAG ATC CAG AGC AGT GGC GTG CTG GAC ATT CTC CAG CCG GAG 286
Leu Pro Glu Ile Gln Ser Ser Gly Val Leu Asp Ile Leu Gln Pro Glu
80 85 90 95

GAG ATC AGG CTG CAG GAG GCC ATG TTT GAG TTG GTT ACC TCT GAG GCC 334
Glu Ile Arg Leu Gln Glu Ala Met Phe Glu Leu Val Thr Ser Glu Ala
100 105 110

TCC TAC TAT AAG AGC CTG AAC CTG CTG GTG TCG CAC TTC ATG GAG AAC 382
Ser Tyr Tyr Lys Ser Leu Asn Leu Leu Val Ser His Phe Met Glu Asn
115 120 125

GAG CGT CTG AAG AAG ATC CTG CAT CCA TCT GAG GCC CAC ATC CTC TTT 430
Glu Arg Leu Lys Lys Ile Leu His Pro Ser Glu Ala His Ile Leu Phe
130 135 140

TCC AAT GTC CTG GAT GTC ATG GCT GTC AGT GAG CGG TTT TTG CTG GAG 478
Ser Asn Val Leu Asp Val Met Ala Val Ser Glu Arg Phe Leu Leu Glu
145 150 155

CTA GAG CAC CGC ATG GAG GAG AAC ATT GTT ATC TCG GAT GTG TGC GAC 526
Leu Glu His Arg Met Glu Glu Asn Ile Val Ile Ser Asp Val Cys Asp
160 165 170 175

ATC GTG TAC CGT TAC GCA GCT GAT CAC TTC TCG GTC TAT ATC ACT TAC 574
Ile Val Tyr Arg Tyr Ala Ala Asp His Phe Ser Val Tyr Ile Thr Tyr
180 185 190

GTC AGT AAC CAG ACC TAC CAG GAA AGG ACA TAC AAG CAG CTC CTA CAG 622
Val Ser Asn Gln Thr Tyr Gln Glu Arg Thr Tyr Lys Gln Leu Leu Gln
195 200 205

GAG AAG GCC GCT TTC CGG GAA CTG ATC GCA CAG TTG GAG CTG GAC CCC 670
Glu Lys Ala Ala Phe Arg Glu Leu Ile Ala Gln Leu Glu Leu Asp Pro
210 215 220

AAA TGC AAG GGG CTG CCT TTC TCC TCC TTC CTC ATC TTG CCT TTC CAG 718
Lys Cys Lys Gly Leu Pro Phe Ser Ser Phe Leu Ile Leu Pro Phe Gln
225 230 235

AGG ATC ACG AGA CTC AAG CTG CTG GTC CAG AAT ATC CTG AAG AGA GTG 766
Arg Ile Thr Arg Leu Lys Leu Leu Val Gln Asn Ile Leu Lys Arg Val
240 245 250 255

GAG GAG AGG TCT GAA CGT GAA GGC ACC GCC TTG GAT GCC CAC AAG GAG 814
Glu Glu Arg Ser Glu Arg Glu Gly Thr Ala Leu Asp Ala His Lys Glu
260 265 270

CTA GAA ATG GTG GTA AAG GCA TGC AAT GAG GGT GTC CGG AAG ATG AGC 862
Leu Glu Met Val Val Lys Ala Cys Asn Glu Gly Val Arg Lys Met Ser
275 280 285

CGC ACA GAA CAG ATG ATC AGC ATT CAG AAG AAG ATG GAG TTC AAG ATC 910
Arg Thr Glu Gln Met Ile Ser Ile Gln Lys Lys Met Glu Phe Lys Ile
290 295 300

AAG TCG GTA CCC ATC ATC TCA CAC TCC CGG TGG CTG CTG AAG CAG GGT 958
Lys Ser Val Pro Ile Ile Ser His Ser Arg Trp Leu Leu Lys Gln Gly
305 310 315

GAG CTG CAG CAG ATG TCC GGC CCC AAG ACC TCC CGC ACC CTG CGG ACC 1006
Glu Leu Gln Gln Met Ser Gly Pro Lys Thr Ser Arg Thr Leu Arg Thr
320 325 330 335

AAG AAG CTC TTC AGA GAA ATT TAC CTC TTC CTC TTC AAT GAC CTG CTG 1054
Lys Lys Leu Phe Arg Glu Ile Tyr Leu Phe Leu Phe Asn Asp Leu Leu
340 345 350

GTG ATC TGC CGG CAG ATC CCT GGA GAC AAG TAC CAG GTG TTT GAT TCG 1102
Val Ile Cys Arg Gln Ile Pro Gly Asp Lys Tyr Gln Val Phe Asp Ser
355 360 365

GCC CCA AGG GGC CTG CU CGA GTG GAG GAG CTG GAG GAC CAG GGT CAA 1150
Ala Pro Arg Gly Leu Leu Arg Val Glu Glu Leu Glu Asp Gln Gly Gln
370 375 380

ACA CTG GCT AAT GTG TTC ATC CTG CGG CTG CTG GAA AAT GCA GAT GAC 1198
Thr Leu Ala Asn Val Phe Ile Leu Arg Leu Leu Glu Asn Ala Asp Asp
385 390 395

CGA GAG GCC ACC TAT ATG CTG AAG GCA TCC TCC CAG AGC GAG ATG AAG 1246
Arg Glu Ala Thr Tyr Met Leu Lys Ala Ser Ser Gln Ser Glu Met Lys
400 405 410 415

CGC TGG ATG ACC TCA CTG GCC CCC AAC AGG AGG ACC AAG TTT GTA TCC 1294
Arg Trp Met Thr Ser Leu Ala Pro Asn Arg Arg Thr Lys Phe Val Ser
420 425 430

TTC ACA TCT CGG CTG TTG GAC TGT CCC CAG GTC CAG TGT GTG CAC CCG 1342
Phe Thr Ser Arg Leu Leu Asp Cys Pro Gln Val Gln Cys Val His Pro
435 440 445

TAT GTG GCC CAG CAG CCT GAT GAA CTG ACG CTG GAA CTG GCA GAT ATC 1390
Tyr Val Ala Gln Gln Pro Asp Glu Leu Thr Leu Glu Leu Ala Asp Ile
450 455 460

CTG AAC ATC CTG GAG AAG ACA GAG GAT GGG TGG ATC TTT GGT GAG CGG 1438
Leu Asn Ile Leu Glu Lys Thr Glu Asp Gly Trp Ile Phe Gly Glu Arg
465 470 475

CTG CAT GAC CAG GAG AGA GGC TGG TTC CCC AGT TCC ATG ACA GAG GAG 1486
Leu His Asp Gln Glu Arg Gly Trp Phe Pro Ser Ser Met Thr Glu Glu
480 485 490 495

ATC CTG AAC CCC AAG ATC CGC TCC CAG AAC CTC AAG GAA TGT TTC CGG 1534
Ile Leu Asn Pro Lys Ile Arg Ser Gln Asn Leu Lys Glu Cys Phe Arg
500 505 510

GTA CAT AAG ATG GAA GAC CCT CAG CGC AGC CAG AAT AAG GAC CGC AGG 1582
Val His Lys Met Glu Asp Pro Gln Arg Ser Gln Asn Lys Asp Arg Arg
515 520 525

AAG CTG GGC AGC CGG AAT CGT CAA TGA ACCTCCCCAG CTCAGGCACC 1629
Lys Leu Gly Ser Arg Asn Arg Gln *
530 535



TGAAGGGAAG GGTGTGGGCA GGGATGGGGA GCAGGCCCGG CAGAGACGCC CGACAGATTC 1689
AGAGGGCCTT AGGGAAGAAT GTCAGTGCCT TCTCAGGCAG CAGGAGTGGC TTCGGCCTGC 1749
TCTGTCCCTG CCCATGCTGT GGAAGCTCTA GTGTCCTGGC CACTTGTTTG CTTGCACACT 1809
GGTGAAAAGC TAAGTACTTA GGCAGTATTA CACCACCTCC CTTCAGTCTC TCAGAGGTAG 1869
AAGAAGGCAG GCATGCTCCA GAGACCTTCC GGTGACTGGA AGAGGCCCAC ACAAGGGTCC 1929
CTGGCAGCAG GCAGGTGGAA GGTAACCACT GTCAGGATCC CCTGAACTGC ACGTGTCCTT 1989
CCCTACTTTG GAAGCTGTTA AGAGTCTACC AGGCACACAG ATGGCCGCCC CTGCCCGAGG 2049
GAGTTTGATG AGCAGTGGTG ACCCTGCCTG CCCGTCCCCG TGCCTCTGCC AGCCTCTCTT 2109
GCACGCCAAG CCCTGCCCTC AGCAGGCTTC CCAAAGCTTA GCTGAGGGTT CATGCCACCT 2169
CTAGCTCCTT GAAGGGCTTG ATATCACTTG TGTCTCCTGG GCCCCTGATG GAGCCCAGGC 2229
GTTTTGCAGA ATGAATTGGT CACTGCATCC TTTATGGTCA TGGTTTTGAG AAAAGCAAAT 2289
ATCATTTTTG GCTGCATTAA AAGAAGCATC CTATATAAAA AAAAAAAAAA AAAA 2343
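
The CDS coordinates given in the FEATURE entries can be checked against the stated protein lengths. The following is a minimal Python sketch, assuming the standard genetic code applies to these cDNAs; the function names `cds_codon_count` and `translate` are illustrative and do not appear in the listing. It confirms that a CDS at 2..1609 spans 536 codons (535 residues plus the stop codon, matching SEQ ID NO: 6) and translates nucleotides 575 to 622 of SEQ ID NO: 5.

```python
from itertools import product

# Standard genetic code with codons ordered T, C, A, G at each position.
# Assumption: the listing was translated with the standard nuclear code.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def cds_codon_count(start: int, end: int) -> int:
    """Number of codons in a CDS given 1-based inclusive coordinates."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = dna.replace(" ", "").upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# SEQ ID NO: 5, CDS 2..1609: 535 residues plus the stop codon.
print(cds_codon_count(2, 1609))
# Nucleotides 575 to 622 of SEQ ID NO: 5 (Val Ser Asn Gln ... Leu Leu Gln).
print(translate("GTC AGT AAC CAG ACC TAC CAG GAA AGG ACA TAC AAG CAG CTC CTA CAG"))
```

The same check applied to SEQ ID NO: 7 (CDS 3..803) gives 267 codons, in agreement with the 267 amino acids of SEQ ID NO: 8; that reading frame runs to the end of the fragment without a stop codon.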



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 535 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Leu Tyr Gln Glu Tyr Arg Asp Lys Ser Thr Leu Gln Glu Ile Glu Thr
1 5 10 15

Arg Arg Gln Gln Asp Ala Glu Ile Gln Gly Asn Ser Asp Gly Ser Gln
20 25 30

Val Gly Glu Asp Ala Gly Glu Glu Glu Glu Glu Glu Glu Glu Gly Glu
35 40 45

Glu Glu Glu Leu Ala Ser Pro Pro Glu Arg Arg Ala Leu Pro Gln Ile
50 55 60

Cys Leu Leu Ser Asn Pro His Ser Arg Phe Asn Leu Trp Gln Asp Leu
65 70 75 80

Pro Glu Ile Gln Ser Ser Gly Val Leu Asp Ile Leu Gln Pro Glu Glu
85 90 95

Ile Arg Leu Gln Glu Ala Met Phe Glu Leu Val Thr Ser Glu Ala Ser
100 105 110

Tyr Tyr Lys Ser Leu Asn Leu Leu Val Ser His Phe Met Glu Asn Glu
115 120 125

Arg Leu Lys Lys Ile Leu His Pro Ser Glu Ala His Ile Leu Phe Ser
130 135 140

Asn Val Leu Asp Val Met Ala Val Ser Glu Arg Phe Leu Leu Glu Leu
145 150 155 160

Glu His Arg Met Glu Glu Asn Ile Val Ile Ser Asp Val Cys Asp Ile
165 170 175

Val Tyr Arg Tyr Ala Ala Asp His Phe Ser Val Tyr Ile Thr Tyr Val
180 185 190

Ser Asn Gln Thr Tyr Gln Glu Arg Thr Tyr Lys Gln Leu Leu Gln Glu
195 200 205

Lys Ala Ala Phe Arg Glu Leu Ile Ala Gln Leu Glu Leu Asp Pro Lys
210 215 220

Cys Lys Gly Leu Pro Phe Ser Ser Phe Leu Ile Leu Pro Phe Gln Arg
225 230 235 240

Ile Thr Arg Leu Lys Leu Leu Val Gln Asn Ile Leu Lys Arg Val Glu
245 250 255

Glu Arg Ser Glu Arg Glu Gly Thr Ala Leu Asp Ala His Lys Glu Leu
260 265 270

Glu Met Val Val Lys Ala Cys Asn Glu Gly Val Arg Lys Met Ser Arg
275 280 285

Thr Glu Gln Met Ile Ser Ile Gln Lys Lys Met Glu Phe Lys Ile Lys
290 295 300

Ser Val Pro Ile Ile Ser His Ser Arg Trp Leu Leu Lys Gln Gly Glu
305 310 315 320

Leu Gln Gln Met Ser Gly Pro Lys Thr Ser Arg Thr Leu Arg Thr Lys
325 330 335

Lys Leu Phe Arg Glu Ile Tyr Leu Phe Leu Phe Asn Asp Leu Leu Val
340 345 350

Ile Cys Arg Gln Ile Pro Gly Asp Lys Tyr Gln Val Phe Asp Ser Ala
355 360 365

Pro Arg Gly Leu Leu Arg Val Glu Glu Leu Glu Asp Gln Gly Gln Thr
370 375 380

Leu Ala Asn Val Phe Ile Leu Arg Leu Leu Glu Asn Ala Asp Asp Arg
385 390 395 400

Glu Ala Thr Tyr Met Leu Lys Ala Ser Ser Gln Ser Glu Met Lys Arg
405 410 415

Trp Met Thr Ser Leu Ala Pro Asn Arg Arg Thr Lys Phe Val Ser Phe
420 425 430

Thr Ser Arg Leu Leu Asp Cys Pro Gln Val Gln Cys Val His Pro Tyr
435 440 445

Val Ala Gln Gln Pro Asp Glu Leu Thr Leu Glu Leu Ala Asp Ile Leu
450 455 460

Asn Ile Leu Glu Lys Thr Glu Asp Gly Trp Ile Phe Gly Glu Arg Leu
465 470 475 480

His Asp Gln Glu Arg Gly Trp Phe Pro Ser Ser Met Thr Glu Glu Ile
485 490 495

Leu Asn Pro Lys Ile Arg Ser Gln Asn Leu Lys Glu Cys Phe Arg Val
500 505 510

His Lys Met Glu Asp Pro Gln Arg Ser Gln Asn Lys Asp Arg Arg Lys
515 520 525

Leu Gly Ser Arg Asn Arg Gln *
530 535



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 803 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 3..803

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GG AGA GCT CTG CCT CAG ATC TGC CTG CTC AGT AAC CCC CAC TCA AGG 47
Arg Ala Leu Pro Gln Ile Cys Leu Leu Ser Asn Pro His Ser Arg
1 5 10 15

TTC AAC CTC TGG CAG GAT CTT CCC GAG ATC CGG AGC AGC GGG GTG CTT 95
Phe Asn Leu Trp Gln Asp Leu Pro Glu Ile Arg Ser Ser Gly Val Leu
20 25 30

GAG ATC CTA CAG CCT GAG GAG ATT AAG CTG CAG GAG GCC ATG TTC GAG 143
Glu Ile Leu Gln Pro Glu Glu Ile Lys Leu Gln Glu Ala Met Phe Glu
35 40 45

CTG GTC ACT TCC GAG GCG TCC TAC TAC AAG AGT CTG AAC CTG CTC GTG 191
Leu Val Thr Ser Glu Ala Ser Tyr Tyr Lys Ser Leu Asn Leu Leu Val
50 55 60

TCC CAC TTC ATG GAG AAC GAG CGG ATA AGG AAG ATC CTG CAC CCG TCC 239
Ser His Phe Met Glu Asn Glu Arg Ile Arg Lys Ile Leu His Pro Ser
65 70 75

GAG GCG CAC ATC CTC TTC TCC AAC GTC CTG GAC GTG CTG GCT GTC AGT 287
Glu Ala His Ile Leu Phe Ser Asn Val Leu Asp Val Leu Ala Val Ser
80 85 90 95

GAG CGG TTC CTC CTG GAG CTG GAG CAC CGG ATG GAG GAG AAC ATC GTC 335
Glu Arg Phe Leu Leu Glu Leu Glu His Arg Met Glu Glu Asn Ile Val
100 105 110

ATC TCT GAC GTG TGT GAC ATC GTG TAC CGT TAT GCG GCC GAC CAC TTC 383
Ile Ser Asp Val Cys Asp Ile Val Tyr Arg Tyr Ala Ala Asp His Phe
115 120 125

TCT GTC TAC ATC ACC TAC GTC AGC AAT CAG ACC TAC CAG GAG CGG ACC 431
Ser Val Tyr Ile Thr Tyr Val Ser Asn Gln Thr Tyr Gln Glu Arg Thr
130 135 140

TAT AAG CAG CTG CTC CAG GAG AAG GCA GCT TTC CGG GAG CTG ATC GCG 479
Tyr Lys Gln Leu Leu Gln Glu Lys Ala Ala Phe Arg Glu Leu Ile Ala
145 150 155

CAG CTA GAG CTC GAC CCC AAG TGC AGG GGG CTG CCC TTC TCC TCC TTC 527
Gln Leu Glu Leu Asp Pro Lys Cys Arg Gly Leu Pro Phe Ser Ser Phe
160 165 170 175

CTC ATC CTG CCT TTC CAG AGG ATC ACA CGC CTC AAG CTG TTG GTC CAG 575
Leu Ile Leu Pro Phe Gln Arg Ile Thr Arg Leu Lys Leu Leu Val Gln
180 185 190

AAC ATC CTG AAG AGG GTA GAA GAG AGG TCT GAG CGG GAG TGC ACT GCT 623
Asn Ile Leu Lys Arg Val Glu Glu Arg Ser Glu Arg Glu Cys Thr Ala
195 200 205

TTG GAT GCT CAC AAG GAG CTG GAA ATG GTG GTA AAG GCA TGC AAC GAG 671
Leu Asp Ala His Lys Glu Leu Glu Met Val Val Lys Ala Cys Asn Glu
210 215 220

GGC GTC AGG AAA ATG AGC CGC ACG GAA CAG ATG ATC AGC ATT CAG AAG 719
Gly Val Arg Lys Met Ser Arg Thr Glu Gln Met Ile Ser Ile Gln Lys
225 230 235

AAG ATG GAG TTC AAG ATC AAG TCG GTG CCC ATC ATC TCC CAC TCC CGC 767
Lys Met Glu Phe Lys Ile Lys Ser Val Pro Ile Ile Ser His Ser Arg
240 245 250 255

TGG CTG CTG AAG CAG GGT GAG CTG CAG CAG ATG TCC 803
Trp Leu Leu Lys Gln Gly Glu Leu Gln Gln Met Ser
260 265



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 267 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Arg Ala Leu Pro Gln Ile Cys Leu Leu Ser Asn Pro His Ser Arg Phe
1 5 10 15

Asn Leu Trp Gln Asp Leu Pro Glu Ile Arg Ser Ser Gly Val Leu Glu
20 25 30

Ile Leu Gln Pro Glu Glu Ile Lys Leu Gln Glu Ala Met Phe Glu Leu
35 40 45

Val Thr Ser Glu Ala Ser Tyr Tyr Lys Ser Leu Asn Leu Leu Val Ser
50 55 60

His Phe Met Glu Asn Glu Arg Ile Arg Lys Ile Leu His Pro Ser Glu
65 70 75 80

Ala His Ile Leu Phe Ser Asn Val Leu Asp Val Leu Ala Val Ser Glu
85 90 95

Arg Phe Leu Leu Glu Leu Glu His Arg Met Glu Glu Asn Ile Val Ile
100 105 110

Ser Asp Val Cys Asp Ile Val Tyr Arg Tyr Ala Ala Asp His Phe Ser
115 120 125

Val Tyr Ile Thr Tyr Val Ser Asn Gln Thr Tyr Gln Glu Arg Thr Tyr
130 135 140

Lys Gln Leu Leu Gln Glu Lys Ala Ala Phe Arg Glu Leu Ile Ala Gln
145 150 155 160

Leu Glu Leu Asp Pro Lys Cys Arg Gly Leu Pro Phe Ser Ser Phe Leu
165 170 175

Ile Leu Pro Phe Gln Arg Ile Thr Arg Leu Lys Leu Leu Val Gln Asn
180 185 190

Ile Leu Lys Arg Val Glu Glu Arg Ser Glu Arg Glu Cys Thr Ala Leu
195 200 205

Asp Ala His Lys Glu Leu Glu Met Val Val Lys Ala Cys Asn Glu Gly
210 215 220

Val Arg Lys Met Ser Arg Thr Glu Gln Met Ile Ser Ile Gln Lys Lys
225 230 235 240

Met Glu Phe Lys Ile Lys Ser Val Pro Ile Ile Ser His Ser Arg Trp
245 250 255

Leu Leu Lys Gln Gly Glu Leu Gln Gln Met Ser
260 265

CLAIMS

1. A polynucleotide encoding murine guanine nucleotide exchange factor (MNGEF) or
a homologue thereof.

2. A polynucleotide according to claim 1 wherein said homologue is human guanine
nucleotide exchange factor (NGEF).

3. A polynucleotide selected from:

(a) polynucleotides comprising the nucleotide sequence set out in SEQ ID No.
1, 3, 5 or 7 or the complement thereof;

(b) polynucleotides comprising a nucleotide sequence capable of hybridising to
the nucleotide sequence set out in SEQ ID No. 1, 3, 5 or 7, or a fragment
thereof;

(c) polynucleotides comprising a nucleotide sequence capable of hybridising to
the complement of the nucleotide sequence set out in SEQ ID No. 1, 3, 5 or
7, or a fragment thereof; and

(d) polynucleotides comprising a polynucleotide sequence which is degenerate
as a result of the genetic code to the polynucleotides defined in (a), (b) or (c).

4. A polynucleotide probe which comprises a fragment of at least 15 nucleotides of a
polynucleotide as defined in any one of claims 1 to 3.

5. A polypeptide in substantially isolated form which comprises the sequence set out in
SEQ ID Nos. 2, 4, 6 or 8, or a polypeptide substantially homologous thereto, or a
fragment of the polypeptide of SEQ ID Nos. 2, 4, 6 or 8.

6. A polynucleotide encoding a polypeptide according to claim 5.

7. A vector comprising a polynucleotide as defined in any one of claims 1 to 3 or 6.

WO 98/23743 PCT/GB97/03302

8. An expression vector comprising a polynucleotide as defined in any one of claims 1
to 3 or 6, operably linked to regulatory sequences capable of directing expression of
said polynucleotide in a host cell.

9. An antibody capable of binding the polypeptide of SEQ ID No. 2, 4, 6 or 8 or
fragment thereof.

10. A method for detecting the presence or absence of a polynucleotide as defined in any
one of claims 1 to 3 or 6 in a biological sample which comprises:

(a) bringing the biological sample containing DNA or RNA into contact with a
probe according to claim 4 under hybridising conditions; and

(b) detecting any duplex formed between the probe and nucleic acid in the sample.

11. A method of detecting polypeptides as defined in claim 5 present in biological
samples which comprises:

(a) providing an antibody according to claim 9;

(b) incubating a biological sample with said antibody under conditions which
allow for the formation of an antibody-antigen complex; and

(c) determining whether antibody-antigen complex comprising said antibody is
formed.

12. A polynucleotide according to any one of claims 1 to 3 or 6 for use in a method of
treatment of the human or animal body.

13. A polypeptide according to claim 5 for use in a method of treatment of the human or
animal body.

14. An antibody according to claim 10 for use in a method of treatment of the human or
animal body.

15. A method of treating a disease or disorder of the nervous system, comprising
administering an effective amount of a polynucleotide as defined in any one of
claims 1 to 3 or 6, to a patient.

16. A method of treating a disease or disorder of the nervous system, comprising
administering an effective amount of a polypeptide as defined in claim 5, to a
patient.

17. A method of treating a disease or disorder of the nervous system, comprising
administering an effective amount of an antibody as defined in claim 10 to a patient.

18. The method of claim 15, 16 or 17 wherein said disease or disorder is a malignancy.
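
Claim 3(d) refers to polynucleotide sequences that are degenerate as a result of the genetic code. The following is a minimal Python sketch of what that degeneracy means in practice, assuming the standard genetic code; the helper names `synonymous_codons`, `count_encodings` and `same_translation` are illustrative only, and the second DNA string in the usage example is a hypothetical alternative encoding, not a sequence from the listing.

```python
from itertools import product
from math import prod

# Standard genetic code with codons ordered T, C, A, G at each position
# (assumption: the standard nuclear code applies).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_codons(aa: str) -> list[str]:
    """All codons that encode the given one-letter amino acid."""
    return sorted(codon for codon, a in CODON_TABLE.items() if a == aa)


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the peptide."""
    return prod(len(synonymous_codons(aa)) for aa in peptide)


def same_translation(dna_a: str, dna_b: str) -> bool:
    """True if two in-frame DNA strings encode the same peptide."""
    def translate(dna: str) -> str:
        return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))
    return translate(dna_a) == translate(dna_b)


# First four residues of SEQ ID NO: 6: Leu Tyr Gln Glu.
print(synonymous_codons("L"))
print(count_encodings("LYQE"))
# CTG TAT CAA GAG opens the CDS of SEQ ID NO: 5; the second string is a
# hypothetical degenerate variant encoding the same four residues.
print(same_translation("CTGTATCAAGAG", "TTATACCAGGAA"))
```

Because most amino acids have several synonymous codons, even a four-residue stretch can be encoded by dozens of distinct nucleotide sequences, which is why the degenerate sequences are claimed as a class and not enumerated.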